VARIATIONAL PRINCIPLES 
in 


CLASSICAL MECHANICS 


SECOND EDITION 




Douglas Cline 


University of Rochester 


24 November 2018 


©2018, 2017 by Douglas Cline 


ISBN: 978-0-9988372-6-0 e-book (Adobe PDF) 
ISBN: 978-0-9988372-7-7 print (Paperback) 


Variational Principles in Classical Mechanics, 2nd edition


Contributors 
Author: Douglas Cline 
Illustrator: Meghan Sarkis 


Published by University of Rochester River Campus Libraries 
University of Rochester 
Rochester, NY 14627 




Variational Principles in Classical Mechanics, 2nd edition by Douglas Cline is licensed under a Creative
Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), except 
where otherwise noted. 


You are free to: 


• Share — copy or redistribute the material in any medium or format.


• Adapt — remix, transform, and build upon the material.


Under the following terms: 


Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes 
were made. You must do so in any reasonable manner, but not in any way that suggests the licensor 
endorses you or your use. 


NonCommercial — You may not use the material for commercial purposes. 


ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions 
under the same license as the original. 


No additional restrictions — You may not apply legal terms or technological measures that legally 
restrict others from doing anything the license permits. 


The licensor cannot revoke these freedoms as long as you follow the license terms. 


Version 2.0 


Contents 


Contents
Preface
Prologue
1 A brief history of classical mechanics
1.1 Introduction
1.2 Greek antiquity
1.3 Middle Ages
1.4 Age of Enlightenment
1.5 Variational methods in physics
1.6 The 20th century revolution in physics

2 Review of Newtonian mechanics
2.1 Introduction
2.2 Newton’s Laws of motion
2.3 Inertial frames of reference
2.4 First-order integrals in Newtonian mechanics
2.4.1 Linear Momentum
2.4.2 Angular momentum
2.4.3 Kinetic energy
2.5 Conservation laws in classical mechanics
2.6 Motion of finite-sized and many-body systems
2.7 Center of mass of a many-body system
2.8 Total linear momentum of a many-body system
2.8.1 Center-of-mass decomposition
2.8.2 Equations of motion
2.9 Angular momentum of a many-body system
2.9.1 Center-of-mass decomposition
2.9.2 Equations of motion
2.10 Work and kinetic energy for a many-body system
2.10.1 Center-of-mass kinetic energy
2.10.2 Conservative forces and potential energy
2.10.3 Total mechanical energy
2.10.4 Total mechanical energy for conservative systems
2.11 Virial Theorem
2.12 Applications of Newton’s equations of motion
2.12.1 Constant force problems
2.12.2 Linear Restoring Force
2.12.3 Position-dependent conservative forces
2.12.4 Constrained Motion
2.12.5 Velocity Dependent Forces
2.12.6 Systems with Variable Mass
2.12.7 Rigid-body rotation about a body-fixed rotation axis



2.12.8 Time dependent forces
2.13 Solution of many-body equations of motion
2.13.1 Analytic solution
2.13.2 Successive approximation
2.13.3 Perturbation method
2.14 Newton’s Law of Gravitation
2.14.1 Gravitational and inertial mass
2.14.2 Gravitational potential energy U
2.14.3 Gravitational potential φ
2.14.4 Potential theory
2.14.5 Curl of the gravitational field
2.14.6 Gauss’s Law for Gravitation
2.14.7 Condensed forms of Newton’s Law of Gravitation
2.15 Summary
Workshop exercises
Problems
3 Linear oscillators
3.1 Introduction
3.2 Linear restoring forces
3.3 Linearity and superposition
3.4 Geometrical representations of dynamical motion
3.4.1 Configuration space (q_i, q_j, t)
3.4.2 State space, (q_i, q̇_i, t)
3.4.3 Phase space, (q_i, p_i, t)
3.4.4 Plane pendulum
3.5 Linearly-damped free linear oscillator
3.5.1 General solution
3.5.2 Energy dissipation
3.6 Sinusoidally-driven, linearly-damped, linear oscillator
3.6.1 Transient response of a driven oscillator
3.6.2 Steady state response of a driven oscillator
3.6.3 Complete solution of the driven oscillator
3.6.4 Resonance
3.6.5 Energy absorption
3.7 Wave equation
3.8 Travelling and standing wave solutions of the wave equation
3.9 Waveform analysis
3.9.1 Harmonic decomposition
3.9.2 The free linearly-damped linear oscillator
3.9.3 Damped linear oscillator subject to an arbitrary periodic force
3.10 Signal processing
3.11 Wave propagation
3.11.1 Phase, group, and signal velocities of wave packets
3.11.2 Fourier transform of wave packets
3.11.3 Wave-packet Uncertainty Principle
3.12 Summary
Workshop exercises
Problems
4 Nonlinear systems and chaos
4.1 Introduction
4.2 Weak nonlinearity
4.3 Bifurcation, and point attractors
4.4 Limit cycles




4.4.1 Poincaré-Bendixson theorem
4.4.2 van der Pol damped harmonic oscillator
4.5 Harmonically-driven, linearly-damped, plane pendulum
4.5.1 Close to linearity
4.5.2 Weak nonlinearity
4.5.3 Onset of complication
4.5.4 Period doubling and bifurcation
4.5.5 Rolling motion
4.5.6 Onset of chaos
4.6 Differentiation between ordered and chaotic motion
4.6.1 Lyapunov exponent
4.6.2 Bifurcation diagram
4.6.3 Poincaré Section
4.7 Wave propagation for non-linear systems
4.7.1 Phase, group, and signal velocities
4.7.2 Soliton wave propagation
4.8 Summary
Workshop exercises
Problems
5 Calculus of variations
5.1 Introduction
5.2 Euler’s differential equation
5.3 Applications of Euler’s equation
5.4 Selection of the independent variable
5.5 Functions with several independent variables y_i(x)
5.6 Euler’s integral equation
5.7 Constrained variational systems
5.7.1 Holonomic constraints
5.7.2 Geometric (algebraic) equations of constraint
5.7.3 Kinematic (differential) equations of constraint
5.7.4 Isoperimetric (integral) equations of constraint
5.7.5 Properties of the constraint equations
5.7.6 Treatment of constraint forces in variational calculus
5.8 Generalized coordinates in variational calculus
5.9 Lagrange multipliers for holonomic constraints
5.9.1 Algebraic equations of constraint
5.9.2 Integral equations of constraint
5.10 Geodesic
5.11 Variational approach to classical mechanics
5.12 Summary
Workshop exercises
Problems
6 Lagrangian dynamics
6.1 Introduction
6.2 Newtonian plausibility argument for Lagrangian mechanics
6.3 Lagrange equations from d’Alembert’s Principle
6.3.1 d’Alembert’s Principle of Virtual Work
6.3.2 Transformation to generalized coordinates
6.3.3 Lagrangian
6.4 Lagrange equations from Hamilton’s Action Principle
6.5 Constrained systems
6.5.1 Choice of generalized coordinates
6.5.2 Minimal set of generalized coordinates




6.5.3 Lagrange multipliers approach
6.5.4 Generalized forces approach
6.6 Applying the Euler-Lagrange equations to classical mechanics
6.7 Applications to unconstrained systems
6.8 Applications to systems involving holonomic constraints
6.9 Applications involving non-holonomic constraints
6.10 Velocity-dependent Lorentz force
6.11 Time-dependent forces
6.12 Impulsive forces
6.13 The Lagrangian versus the Newtonian approach to classical mechanics
6.14 Summary
Workshop exercises
Problems
7 Symmetries, Invariance and the Hamiltonian
7.1 Introduction
7.2 Generalized momentum
7.3 Invariant transformations and Noether’s Theorem
7.4 Rotational invariance and conservation of angular momentum
7.5 Cyclic coordinates
7.6 Kinetic energy in generalized coordinates
7.7 Generalized energy and the Hamiltonian function
7.8 Generalized energy theorem
7.9 Generalized energy and total energy
7.10 Hamiltonian invariance
7.11 Hamiltonian for cyclic coordinates
7.12 Symmetries and invariance
7.13 Hamiltonian in classical mechanics
7.14 Summary
Workshop exercises
Problems
8 Hamiltonian mechanics
8.1 Introduction
8.2 Legendre Transformation between Lagrangian and Hamiltonian mechanics
8.3 Hamilton’s equations of motion
8.3.1 Canonical equations of motion
8.4 Hamiltonian in different coordinate systems
8.4.1 Cylindrical coordinates ρ, z, φ
8.4.2 Spherical coordinates, r, θ, φ
8.5 Applications of Hamiltonian Dynamics
8.6 Routhian reduction
8.6.1 R_cyclic - Routhian is a Hamiltonian for the cyclic variables
8.6.2 R_noncyclic - Routhian is a Hamiltonian for the non-cyclic variables
8.7 Variable-mass systems
8.7.1 Rocket propulsion
8.7.2 Moving chains
8.8 Summary
Workshop exercises
Problems
9 Hamilton’s Action Principle
9.1 Introduction
9.2 Hamilton’s Principle of Stationary Action
9.2.1 Stationary-action principle in Lagrangian mechanics
9.2.2 Stationary-action principle in Hamiltonian mechanics


9.2.3 Abbreviated action
9.2.4 Hamilton’s Principle applied using initial boundary conditions
9.3 Lagrangian
9.3.1 Standard Lagrangian
9.3.2 Gauge invariance of the standard Lagrangian
9.3.3 Non-standard Lagrangians
9.3.4 Inverse variational calculus
9.4 Application of Hamilton’s Action Principle to mechanics
9.5 Summary

10 Nonconservative systems
10.1 Introduction
10.2 Origins of nonconservative motion
10.3 Algebraic mechanics for nonconservative systems
10.4 Rayleigh’s dissipation function
10.4.1 Generalized dissipative forces for linear velocity dependence
10.4.2 Generalized dissipative forces for nonlinear velocity dependence
10.4.3 Lagrange equations of motion
10.4.4 Hamiltonian mechanics
10.5 Dissipative Lagrangians
10.6 Summary

11 Conservative two-body central forces
11.1 Introduction
11.2 Equivalent one-body representation for two-body motion
11.3 Angular momentum L
11.4 Equations of motion
11.5 Differential orbit equation
11.6 Hamiltonian
11.7 General features of the orbit solutions
11.8 Inverse-square, two-body, central force
11.8.1 Bound orbits
11.8.2 Kepler’s laws for bound planetary motion
11.8.3 Unbound orbits
11.8.4 Eccentricity vector
11.9 Isotropic, linear, two-body, central force
11.9.1 Polar coordinates
11.9.2 Cartesian coordinates
11.9.3 Symmetry tensor A′
11.10 Closed-orbit stability
11.11 The three-body problem
11.12 Two-body scattering
11.12.1 Total two-body scattering cross section
11.12.2 Differential two-body scattering cross section
11.12.3 Impact parameter dependence on scattering angle
11.12.4 Rutherford scattering
11.13 Two-body kinematics
11.14 Summary
Workshop exercises
Problems

12 Non-inertial reference frames
12.1 Introduction
12.2 Translational acceleration of a reference frame
12.3 Rotating reference frame
12.3.1 Spatial time derivatives in a rotating, non-translating, reference frame


12.3.2 General vector in a rotating, non-translating, reference frame
12.4 Reference frame undergoing rotation plus translation
12.5 Newton’s law of motion in a non-inertial frame
12.6 Lagrangian mechanics in a non-inertial frame
12.7 Centrifugal force
12.8 Coriolis force
12.9 Routhian reduction for rotating systems
12.10 Effective gravitational force near the surface of the Earth
12.11 Free motion on the earth
12.12 Weather systems
12.12.1 Low-pressure systems
12.12.2 High-pressure systems
12.13 Foucault pendulum
12.14 Summary
Workshop exercises
Problems

13 Rigid-body rotation
13.1 Introduction
13.2 Rigid-body coordinates
13.3 Rigid-body rotation about a body-fixed point
13.4 Inertia tensor
13.5 Matrix and tensor formulations of rigid-body rotation
13.6 Principal axis system
13.7 Diagonalize the inertia tensor
13.8 Parallel-axis theorem
13.9 Perpendicular-axis theorem for plane laminae
13.10 General properties of the inertia tensor
13.10.1 Inertial equivalence
13.10.2 Orthogonality of principal axes
13.11 Angular momentum L and angular velocity ω vectors
13.12 Kinetic energy of rotating rigid body
13.13 Euler angles
13.14 Angular velocity ω
13.15 Kinetic energy in terms of Euler angular velocities
13.16 Rotational invariants
13.17 Euler’s equations of motion for rigid-body rotation
13.18 Lagrange equations of motion for rigid-body rotation
13.19 Hamiltonian equations of motion for rigid-body rotation
13.20 Torque-free rotation of an inertially-symmetric rigid rotor
13.20.1 Euler’s equations of motion
13.20.2 Lagrange equations of motion
13.21 Torque-free rotation of an asymmetric rigid rotor
13.22 Stability of torque-free rotation of an asymmetric body
13.23 Symmetric rigid rotor subject to torque about a fixed point
13.24 The rolling wheel
13.25 Dynamic balancing of wheels
13.26 Rotation of deformable bodies
13.27 Summary
Workshop exercises
Problems

14 Coupled linear oscillators
14.1 Introduction
14.2 Two coupled linear oscillators


14.3 Normal modes
14.4 Center of mass oscillations
14.5 Weak coupling
14.6 General analytic theory for coupled linear oscillators
14.6.1 Kinetic energy tensor T
14.6.2 Potential energy tensor V
14.6.3 Equations of motion
14.6.4 Superposition
14.6.5 Eigenfunction orthonormality
14.6.6 Normal coordinates
14.7 Two-body coupled oscillator systems
14.8 Three-body coupled linear oscillator systems
14.9 Molecular coupled oscillator systems
14.10 Discrete Lattice Chain
14.10.1 Longitudinal motion
14.10.2 Transverse motion
14.10.3 Normal modes
14.10.4 Travelling waves
14.10.5 Dispersion
14.10.6 Complex wavenumber
14.11 Damped coupled linear oscillators
14.12 Collective synchronization of coupled oscillators
14.13 Summary
Workshop exercises
Problems

15 Advanced Hamiltonian mechanics
15.1 Introduction
15.2 Poisson bracket representation of Hamiltonian mechanics
15.2.1 Poisson Brackets
15.2.2 Fundamental Poisson brackets
15.2.3 Poisson bracket invariance to canonical transformations
15.2.4 Correspondence of the commutator and the Poisson Bracket
15.2.5 Observables in Hamiltonian mechanics
15.2.6 Hamilton’s equations of motion
15.2.7 Liouville’s Theorem
15.3 Canonical transformations in Hamiltonian mechanics
15.3.1 Generating functions
15.3.2 Applications of canonical transformations
15.4 Hamilton-Jacobi theory
15.4.1 Time-dependent Hamiltonian
15.4.2 Time-independent Hamiltonian
15.4.3 Separation of variables
15.4.4 Visual representation of the action function S
15.4.5 Advantages of Hamilton-Jacobi theory
15.5 Action-angle variables
15.5.1 Canonical transformation
15.5.2 Adiabatic invariance of the action variables
15.6 Canonical perturbation theory
15.7 Symplectic representation
15.8 Comparison of the Lagrangian and Hamiltonian formulations
15.9 Summary
Workshop exercises
Problems


16 Analytical formulations for continuous systems
16.1 Introduction
16.2 The continuous uniform linear chain
16.3 The Lagrangian density formulation for continuous systems
16.3.1 One spatial dimension
16.3.2 Three spatial dimensions
16.4 The Hamiltonian density formulation for continuous systems
16.5 Linear elastic solids
16.5.1 Stress tensor
16.5.2 Strain tensor
16.5.3 Moduli of elasticity
16.5.4 Equations of motion in a uniform elastic media
16.6 Electromagnetic field theory
16.6.1 Maxwell stress tensor
16.6.2 Momentum in the electromagnetic field
16.7 Ideal fluid dynamics
16.7.1 Continuity equation
16.7.2 Euler’s hydrodynamic equation
16.7.3 Irrotational flow and Bernoulli’s equation
16.7.4 Gas flow
16.8 Viscous fluid dynamics
16.8.1 Navier-Stokes equation
16.8.2 Reynolds number
16.8.3 Laminar and turbulent fluid flow
16.9 Summary and implications
17 Relativistic mechanics
17.1 Introduction
17.2 Galilean Invariance
17.3 Special Theory of Relativity
17.3.1 Einstein Postulates
17.3.2 Lorentz transformation
17.3.3 Time Dilation
17.3.4 Length Contraction
17.3.5 Simultaneity
17.4 Relativistic kinematics
17.4.1 Velocity transformations
17.4.2 Momentum
17.4.3 Center of momentum coordinate system
17.4.4 Force
17.4.5 Energy
17.5 Geometry of space-time
17.5.1 Four-dimensional space-time
17.5.2 Four-vector scalar products
17.5.3 Minkowski space-time
17.5.4 Momentum-energy four vector
17.6 Lorentz-invariant formulation of Lagrangian mechanics
17.6.1 Parametric formulation
17.6.2 Extended Lagrangian
17.6.3 Extended generalized momenta
17.6.4 Extended Lagrange equations of motion
17.7 Lorentz-invariant formulations of Hamiltonian mechanics
17.7.1 Extended canonical formalism
17.7.2 Extended Poisson Bracket representation
17.7.3 Extended canonical transformation and Hamilton-Jacobi theory


17.7.4 Validity of the extended Hamilton-Lagrange formalism
17.8 The General Theory of Relativity
17.8.1 The fundamental concepts
17.8.2 Einstein’s postulates for the General Theory of Relativity
17.8.3 Experimental evidence in support of the General Theory of Relativity
17.9 Implications of relativistic theory to classical mechanics
17.10 Summary
Workshop exercises
Problems


18 The transition to quantum physics
18.1 Introduction
18.2 Brief summary of the origins of quantum theory
18.2.1 Bohr model of the atom
18.2.2 Quantization
18.2.3 Wave-particle duality
18.3 Hamiltonian in quantum theory
18.3.1 Heisenberg’s matrix-mechanics representation
18.3.2 Schrödinger’s wave-mechanics representation
18.4 Lagrangian representation in quantum theory
18.5 Correspondence Principle
18.6 Summary


19 Epilogue 


Appendices 


A Matrix algebra
A.1 Mathematical methods for mechanics
A.2 Matrices
A.3 Determinants
A.4 Reduction of a matrix to diagonal form


B Vector algebra
B.1 Linear operations
B.2 Scalar product
B.3 Vector product
B.4 Triple products


C Orthogonal coordinate systems
C.1 Cartesian coordinates (x, y, z)
C.2 Curvilinear coordinate systems
C.2.1 Two-dimensional polar coordinates (r, θ)
C.2.2 Cylindrical Coordinates (ρ, φ, z)
C.2.3 Spherical Coordinates (r, θ, φ)
C.3 Frenet-Serret coordinates


D Coordinate transformations
D.1 Translational transformations
D.2 Rotational transformations
D.2.1 Rotation matrix
D.2.2 Finite rotations
D.2.3 Infinitesimal rotations
D.2.4 Proper and improper rotations
D.3 Spatial inversion transformation
D.4 Time reversal transformation




E Tensor algebra 


E.1 Tensors
E.2 Tensor products
E.2.1 Tensor outer product
E.2.2 Tensor inner product
E.3 Tensor properties
E.4 Contravariant and covariant tensors
E.5 Generalized inner product
E.6 Transformation properties of observables


F Aspects of multivariate calculus 


F.1 Partial differentiation
F.2 Linear operators
F.3 Transformation Jacobian
F.3.3 Properties of the Jacobian
F.4 Legendre transformation


G Vector differential calculus 


G.1 Scalar differential operators
G.1.1 Scalar field
G.1.2 Vector field
G.2 Vector differential operators in cartesian coordinates
G.2.1 Scalar field
G.2.2 Vector field
G.3 Vector differential operators in curvilinear coordinates
G.3.1 Gradient
G.3.2 Divergence


H Vector integral calculus 


H.1 Line integral of the gradient of a scalar field
H.2 Divergence theorem
H.2.1 Flux of a vector field for Gaussian surface
H.2.2 Divergence in cartesian coordinates
H.3 Stokes Theorem
H.3.1 The curl
H.3.2 Curl in cartesian coordinates
H.4 Potential formulations of curl-free and divergence-free fields


I Waveform analysis 


I.1 Harmonic waveform decomposition
I.1.1 Periodic systems and the Fourier Series
I.1.2 Aperiodic systems and the Fourier Transform
I.2 Time-sampled waveform analysis
I.2.1 Delta-function impulse response
I.2.2 Green's function waveform decomposition

Bibliography

Index




Examples 


Example: Exploding cannon shell
Example: Billiard-ball collisions
Example: Bolas thrown by gaucho
Example: Central force
Example: The ideal gas law
Example: The mass of galaxies
Example: Diatomic molecule
Example: Roller coaster
Example: Vertical fall in the earth's gravitational field
Example: Projectile motion in air
Example: Moment of inertia of a thin door
Example: Merry-go-round
Example: Cue pushes a billiard ball
Example: Center of percussion of a baseball bat
Example: Energy transfer in charged-particle scattering
Example: Field of a uniform sphere
Example: Harmonically-driven series RLC circuit
Example: Vibration isolation
Example: Water waves breaking on a beach
Example: Surface waves for deep water
Example: Electromagnetic waves in ionosphere
Example: Fourier transform of a Gaussian wave packet
Example: Fourier transform of a rectangular wave packet
Example: Acoustic wave packet
Example: Gravitational red shift
Example: Quantum baseball
Example: Non-linear oscillator
Example: Shortest distance between two points
Example: Brachistochrone problem
Example: Minimal travel cost
Example: Surface area of a cylindrically-symmetric soap bubble
Example: Fermat's Principle
Example: Minimum of $(\nabla\phi)^2$ in a volume
Example: Two dependent variables coupled by one holonomic constraint
Example: Catenary
Example: The Queen Dido problem
Example: Motion of a free particle, U=0
Example: Motion in a uniform gravitational field
Example: Central forces
Example: Disk rolling on an inclined plane
Example: Two connected masses on frictionless inclined planes
Example: Two blocks connected by a frictionless bar
Example: Block sliding on a movable frictionless inclined plane
Example: Sphere rolling without slipping down an inclined plane on a frictionless floor
Example: Mass sliding on a rotating straight frictionless rod




Example: Spherical pendulum
Example: Spring plane pendulum
Example: The yo-yo
Example: Mass constrained to move on the inside of a frictionless paraboloid
Example: Mass on a frictionless plane connected to a plane pendulum
Example: Two connected masses constrained to slide along a moving rod
Example: Mass sliding on a frictionless spherical shell
Example: Rolling solid sphere on a spherical shell
Example: Solid sphere rolling plus slipping on a spherical shell
Example: Small body held by friction on the periphery of a rolling wheel
Example: Plane pendulum hanging from a vertically-oscillating support
Example: Series-coupled double pendulum subject to impulsive force
Example: Feynman's angular-momentum paradox
Example: Atwood's machine
Example: Conservation of angular momentum for rotational invariance
Example: Diatomic molecules and axially-symmetric nuclei
Example: Linear harmonic oscillator on a cart moving at constant velocity
Example: Isotropic central force in a rotating frame
Example: The plane pendulum
Example: Oscillating cylinder in a cylindrical bowl
Example: Motion in a uniform gravitational field
Example: One-dimensional harmonic oscillator
Example: Plane pendulum
Example: Hooke's law force constrained to the surface of a cylinder
Example: Electron motion in a cylindrical magnetron
Example: Spherical pendulum using Hamiltonian mechanics
Example: Spherical pendulum using $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$
Example: Spherical pendulum using $R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi})$
Example: Single particle moving in a vertical plane under the influence of an inverse-square central force
Example: Folded chain
Example: Falling chain
Example: Gauge invariance in electromagnetism
Example: Driven, linearly-damped, coupled linear oscillators
Example: Kirchhoff's rules for electrical circuits
Example: The linearly-damped, linear oscillator
Example: Central force leading to a circular orbit $r = 2R\cos\theta$
Example: Orbit equation of motion for a free body
Example: Linear two-body restoring force
Example: Inverse square law attractive force
Example: Attractive inverse cubic central force
Example: Spiralling mass attached by a string to a hanging mass
Example: Two-body scattering by an inverse cubic force
Example: Accelerating spring plane pendulum
Example: Surface of rotating liquid
Example: The pirouette
Example: Cranked plane pendulum
Example: Nucleon orbits in deformed nuclei
Example: Free fall from rest
Example: Projectile fired vertically upwards
Example: Motion parallel to Earth's surface
Example: Inertia tensor of a solid cube rotating about the center of mass
Example: Inertia tensor about a corner of a solid cube
Example: Inertia tensor of a hula hoop
Example: Inertia tensor of a thin book


13.5 Example: Rotation about the center of mass of a solid cube
13.6 Example: Rotation about the corner of the cube
13.7 Example: Euler angle transformation
13.8 Example: Rotation of a dumbbell
13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
13.10 Example: Tennis racquet dynamics
13.11 Example: Rotation of asymmetrically-deformed nuclei
13.12 Example: The Spinning "Jack"
13.13 Example: The Tippe Top
13.14 Example: Tipping stability of a rolling wheel
13.15 Example: Pivoting
13.16 Example: Rolling
13.17 Example: Forces on the bearings of a rotating circular disk
14.1 Example: The Grand Piano
14.2 Example: Two coupled linear oscillators
14.3 Example: Two equal masses series-coupled by two equal springs
14.4 Example: Two parallel-coupled plane pendula
14.5 Example: The series-coupled double plane pendula
14.6 Example: Three plane pendula; mean-field linear coupling
14.7 Example: Three plane pendula; nearest-neighbor coupling
14.8 Example: System of three bodies coupled by six springs
14.9 Example: Linear triatomic molecular CO$_2$
14.10 Example: Benzene ring
14.11 Example: Two linearly-damped coupled linear oscillators
14.12 Example: Collective motion in nuclei
15.1 Example: Check that a transformation is canonical
15.2 Example: Angular momentum
15.3 Example: Lorentz force in electromagnetism
15.4 Example: Wave motion
15.5 Example: Two-dimensional, anisotropic, linear oscillator
15.6 Example: The eccentricity vector
15.7 Example: The identity canonical transformation
15.8 Example: The point canonical transformation
15.9 Example: The exchange canonical transformation
15.10 Example: Infinitesimal point canonical transformation
15.11 Example: 1-D harmonic oscillator via a canonical transformation
15.12 Example: Free particle
15.13 Example: Point particle in a uniform gravitational field
15.14 Example: One-dimensional harmonic oscillator
15.15 Example: The central force problem
15.16 Example: Linearly-damped, one-dimensional, harmonic oscillator
15.17 Example: Adiabatic invariance for the simple pendulum
15.18 Example: Harmonic oscillator perturbation
15.19 Example: Lindblad resonance in planetary and galactic motion
16.1 Example: Acoustic waves in a gas
17.1 Example: Muon lifetime
17.2 Example: Relativistic Doppler Effect
17.3 Example: Twin paradox
17.4 Example: Rocket propulsion
17.5 Example: Lagrangian for a relativistic free particle
17.6 Example: Relativistic particle in an external electromagnetic field
17.7 Example: The Bohr-Sommerfeld hydrogen atom
A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix
A.2 Example: Degenerate eigenvalues of real symmetric matrix
D.1 Example: Rotation matrix


Example: Proof that a rotation matrix is orthogonal
Example: Displacement gradient tensor
Example: Jacobian for transform from cartesian to spherical coordinates
Example: Maxwell's Flux Equations
Example: Buoyancy forces in fluids
Example: Maxwell's circulation equations
Example: Electromagnetic fields
Example: Fourier transform of a single isolated square pulse
Example: Fourier transform of the Dirac delta function


Preface 


The goal of this book is to introduce the reader to the intellectual beauty, and philosophical implications, 
of the fact that nature obeys variational principles plus Hamilton’s Action Principle which underlie the 
Lagrangian and Hamiltonian analytical formulations of classical mechanics. These variational methods, 
which were developed for classical mechanics during the 18th and 19th centuries, have become the preeminent
formalisms for classical dynamics, as well as for many other branches of modern science and engineering. 
The ambitious goal of this book is to lead the reader from the intuitive Newtonian vectorial formulation, to 
introduction of the more abstract variational principles that underlie Hamilton’s Principle and the related 
Lagrangian and Hamiltonian analytical formulations. This culminates in discussion of the contributions of 
variational principles to classical mechanics and the development of relativistic and quantum mechanics. 
The broad scope of this book attempts to unify the undergraduate physics curriculum by bridging the 
chasm that divides the Newtonian vector-differential formulation, and the integral variational formulation of 
classical mechanics, as well as the corresponding philosophical approaches adopted in classical and quantum 
mechanics. This book introduces the powerful variational techniques in mathematics, and their application to 
physics. Application of the concepts of the variational approach to classical mechanics is ideal for illustrating 
the power and beauty of applying variational principles. 

The development of this textbook was influenced by three textbooks: The Variational Principles of 
Mechanics by Cornelius Lanczos (1949) [La49], Classical Mechanics (1950) by Herbert Goldstein [Go50],
and Classical Dynamics of Particles and Systems (1965) by Jerry B. Marion [Ma65]. Marion's excellent
textbook was unusual in partially bridging the chasm between the outstanding graduate texts by Goldstein 
and Lanczos, and a bevy of introductory texts based on Newtonian mechanics that were available at that 
time. The present textbook was developed to provide a more modern presentation of the techniques and 
philosophical implications of the variational approaches to classical mechanics, with a breadth and depth 
close to that provided by Goldstein and Lanczos, but in a format that better matches the needs of the 
undergraduate student. An additional goal is to bridge the gap between classical and modern physics in the 
undergraduate curriculum. The underlying philosophical approach adopted by this book was espoused by
Galileo Galilei: “You cannot teach a man anything; you can only help him find it within himself.”

This book was written in support of the physics junior/senior undergraduate course P235W entitled
“Variational Principles in Classical Mechanics” that the author taught at the University of Rochester
between 1993 and 2015. Initially the lecture notes were distributed to students to allow pre-lecture study,
facilitate accurate transmission of the complicated formulae, and minimize note taking during lectures.
These lecture notes evolved into the present textbook. The target audience of this course typically
comprised ≈ 70% junior/senior undergraduates, ≈ 25% sophomores, < 5% graduate students, and the occasional
well-prepared freshman. The target audience was physics and astrophysics majors, but the course attracted
a significant fraction of majors from other disciplines such as mathematics, chemistry, optics, engineering,
music, and the humanities. As a consequence, the book includes appreciable introductory level physics, plus
mathematical review material, to accommodate the diverse range of prior preparation of the students. This
textbook includes material that extends beyond what reasonably can be covered during a one-term course.
This supplemental material is presented to show the importance and broad applicability of variational
concepts to classical mechanics. The book includes 164 worked examples to illustrate the concepts presented.
Advanced group-theoretic concepts are minimized to better accommodate the mathematical skills of the typical
undergraduate physics major. To conform with modern literature in this field, this book follows the
widely-adopted nomenclature used in “Classical Mechanics” by Goldstein [Go50], with recent additions by
Johns [Jo05].

The second edition of this book has revised the presentation and includes recent developments in the 
field. The book is broken into four major sections, the first of which presents a brief historical introduction 
(chapter 1), followed by a review of the Newtonian formulation of mechanics plus gravitation (chapter 
2), linear oscillators and wave motion (chapter 3), and an introduction to non-linear dynamics and chaos 
(chapter 4). The second section introduces the variational principles of analytical mechanics that underlie 
this book. It includes an introduction to the calculus of variations (chapter 5), the Lagrangian formulation of 
mechanics with applications to holonomic and non-holonomic systems (chapter 6), and a discussion of symmetries,
invariance, and Noether's theorem (chapter 7). It then presents an introduction to the Hamiltonian, the
Hamiltonian formulation of mechanics, the Routhian reduction technique, and a discussion of the subtleties
involved in applying variational principles to variable-mass problems (chapter 8). The second edition of
this book presents a unified introduction to Hamilton's Principle, introduces a new approach for applying
Hamilton's Principle to systems subject to initial boundary conditions, and discusses how best to exploit the 
hierarchy of related formulations based on action, Lagrangian/Hamiltonian, and equations of motion, when 
solving problems subject to symmetries (chapter 9). A consolidated introduction to the application of the 
variational approach to nonconservative systems is presented (chapter 10). The third section of the book
applies Lagrangian and Hamiltonian formulations of classical dynamics to central force problems (chapter 11),
motion in non-inertial frames (chapter 12), rigid-body rotation (chapter 13), and coupled linear oscillators 
(chapter 14). The fourth section of the book introduces advanced applications of Hamilton's Action Principle, 
Lagrangian mechanics and Hamiltonian mechanics. These include Poisson brackets, Liouville's theorem, 
canonical transformations, Hamilton-Jacobi theory, the action-angle technique (chapter 15), and classical 
mechanics in the continua (chapter 16). This is followed by a brief review of the revolution in classical 
mechanics introduced by Einstein’s theory of relativistic mechanics. The extended theory of Lagrangian and 
Hamiltonian mechanics is used to apply variational techniques to the Special Theory of Relativity, followed 
by a discussion of the use of variational principles in the development of the General Theory of Relativity 
(chapter 17). The book finishes with a brief review of the role of variational principles in bridging the gap 
between classical mechanics and quantum mechanics (chapter 18). These advanced topics extend beyond
the typical syllabus for an undergraduate classical mechanics course. They are included to stimulate student 
interest in physics by giving them a glimpse of the physics at the summit that they have already struggled 
to climb. This glimpse illustrates the breadth of classical mechanics, and the pivotal role that variational 
principles have played in the development of classical, relativistic, quantal, and statistical mechanics. 

The front cover picture of this book shows a sailplane soaring high above the Italian Alps. This picture 
epitomizes the unlimited horizon of opportunities provided when the full dynamic range of variational
principles is applied to classical mechanics. The adjacent pictures of the galaxy and the skier represent the wide
dynamic range of applicable topics that span from the origin of the universe, to everyday life. These cover 
pictures reflect the beauty and unity of the foundation provided by variational principles to the development 
of classical mechanics. 

Information regarding the associated P235 undergraduate course at the University of Rochester is available
on the web site at http://www.pas.rochester.edu/~cline/P235/index.shtml. Information about the
author is available at the Cline home web site: http://www.pas.rochester.edu/~cline/index.html.

The author thanks Meghan Sarkis who prepared many of the illustrations, Joe Easterly who designed the 
book cover plus the webpage, and Moriana Garcia who organized publication. Andrew Sifain developed the 
diagnostic workshop questions. The author appreciates the permission, granted by Professor Struckmeier, to 
quote his published article on the extended Hamilton-Lagrangian formalism. The author acknowledges the 
feedback and suggestions made by many students who have taken this course, as well as helpful suggestions 
by his colleagues: Andrew Abrams, Adam Hayes, Connie Jones, Andrew Melchionna, David Munson, Alice
Quillen, Richard Sarkis, James Schneeloch, Steven Torrisi, Dan Watson, and Frank Wolfs. These lecture
notes were typed in LaTeX using Scientific WorkPlace (MacKichan Software, Inc.), while Adobe Illustrator,
Photoshop, Origin, Mathematica, and MuPAD were used to prepare the illustrations.


Douglas Cline, 
University of Rochester, 2018 


Prologue 


Two dramatically different philosophical approaches to science were developed in the field of classical
mechanics during the 17th and 18th centuries. This time period coincided with the Age of Enlightenment in Europe
during which remarkable intellectual and philosophical developments occurred. This was a time when both 
philosophical and causal arguments were equally acceptable in science, in contrast with current convention 
where there appears to be tacit agreement to discourage use of philosophical arguments in science. 


Snell’s Law: The genesis of two contrasting philosophical approaches to science relates back to early
studies of the reflection and refraction of light. The velocity of light in a medium of refractive
index $n$ equals $v = \frac{c}{n}$. Thus a light beam incident at an angle $\theta_1$ to the normal of a
plane interface between medium 1 and medium 2 is refracted at an angle $\theta_2$ in medium 2, where the
angles are related by Snell’s Law.

$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2} = \frac{n_2}{n_1} \qquad \text{(Snell's Law)}$$

Ibn Sahl of Baghdad (984) first described the refraction of light, while Snell (1621) derived his law
mathematically. Both of these scientists used the “vectorial approach” where the light velocity
$\mathbf{v}$ is considered to be a vector pointing in the direction of propagation.


Fermat’s Principle: Fermat’s principle of least time (1657), which is based on the work of Hero of
Alexandria (~60) and Ibn al-Haytham (1021), states that “light travels between two given points along
the path of shortest time”. The transit time $\tau$ of a light beam between two locations $A$ and $B$,
in a medium with position-dependent refractive index $n(s)$, is given by

$$\tau = \int_{t_A}^{t_B} dt = \frac{1}{c}\int_{A}^{B} n(s)\,ds \qquad \text{(Fermat's Principle)}$$
Fermat’s Principle leads to the derivation of Snell’s Law. Philosophically, the physics underlying the
contrasting vectorial and Fermat’s Principle derivations of Snell’s Law is dramatically different. The
vectorial approach is based on differential relations between the velocity vectors in the two media,
whereas Fermat’s variational approach is based on the fact that the light preferentially selects a path
for which the integral of the transit time between the initial location $A$ and the final location $B$
is minimized. That is, the first approach is based on “vectorial mechanics” whereas Fermat’s approach is
based on variational principles in that the path between the initial and final locations is varied to
find the path that minimizes the transit time. Fermat’s enunciation of variational principles in physics
played a key role in the historical development, and subsequent exploitation, of the principle of least
action in analytical formulations of classical mechanics as discussed below.

Figure 1: Vectorial and variational representations of Snell’s Law for refraction of light.
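The variational route from Fermat’s Principle to Snell’s Law can be checked numerically. The following is a minimal sketch, not taken from the book: it assumes a flat interface along y = 0, a source A = (0, a) in medium 1, a receiver B = (d, -b) in medium 2, and illustrative values n1 = 1.0, n2 = 1.5 with c = 1. The function names are hypothetical. The crossing point on the interface is varied until the transit time is smallest, and the result is compared with $n_1 \sin\theta_1 = n_2 \sin\theta_2$.

```python
import math

def transit_time(x, a=1.0, b=1.0, d=2.0, n1=1.0, n2=1.5, c=1.0):
    """Transit time from A=(0, a) to B=(d, -b) when the ray crosses y=0 at (x, 0)."""
    path1 = math.hypot(x, a)       # path length in medium 1
    path2 = math.hypot(d - x, b)   # path length in medium 2
    return (n1 * path1 + n2 * path2) / c

def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:                      # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

def snell_check(a=1.0, b=1.0, d=2.0, n1=1.0, n2=1.5):
    """Vary the crossing point to minimize the transit time, then return
    the crossing point together with n1*sin(theta1) and n2*sin(theta2)."""
    x = minimize_golden(lambda u: transit_time(u, a, b, d, n1, n2), 0.0, d)
    sin1 = x / math.hypot(x, a)
    sin2 = (d - x) / math.hypot(d - x, b)
    return x, n1 * sin1, n2 * sin2

if __name__ == "__main__":
    x, lhs, rhs = snell_check()
    print(f"crossing point x = {x:.6f}")
    print(f"n1 sin(theta1) = {lhs:.6f}, n2 sin(theta2) = {rhs:.6f}")
```

The transit time is a sum of two convex functions of the crossing point, so it has a single minimum and a bracketing search is sufficient. The two printed products should agree to within the accuracy of the search.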
Newtonian mechanics: Momentum and force are vectors that underlie the Newtonian formulation of
classical mechanics. Newton’s monumental treatise, entitled “Philosophiae Naturalis Principia
Mathematica”, published in 1687, established his three universal laws of motion, the universal theory of
gravitation, the derivation of Kepler’s three laws of planetary motion, and the development of calculus.
Newton’s three universal laws of motion provide the most intuitive approach to classical mechanics in
that they are based on vector quantities like momentum, and the rate of change of momentum, which are
related to force. Newton’s equation of motion

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad \text{(Newton's equation of motion)}$$

is a vector differential relation between the instantaneous forces and rate of change of momentum, or
equivalent instantaneous acceleration, all of which are vector quantities. Momentum and force are easy to
visualize, and both cause and effect are embedded in Newtonian mechanics. Thus, if all of the forces,
including the constraint forces, acting on the system are known, then the motion is solvable for two-body
systems. The mathematics for handling Newton’s “vectorial mechanics” approach to classical mechanics is
well established.
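As an illustration of the differential, step-by-step character of the vectorial approach, the sketch below advances the momentum and position of a single particle using $\mathbf{F} = d\mathbf{p}/dt$. It is a minimal example under stated assumptions, not a method from the book: the force is uniform gravity, the integration rule is the simple semi-implicit Euler step, and the names `integrate_newton` and `gravity` are hypothetical.

```python
def integrate_newton(force, m, r0, p0, dt, steps):
    """Advance position r and momentum p in small time steps using dp/dt = F.

    Uses the semi-implicit Euler rule: update p from the force first,
    then update r from the new velocity p/m.
    """
    r = list(r0)
    p = list(p0)
    for _ in range(steps):
        f = force(r, p)
        p = [pi + fi * dt for pi, fi in zip(p, f)]
        r = [ri + (pi / m) * dt for ri, pi in zip(r, p)]
    return r, p

def gravity(m=1.0, g=9.81):
    """Return a force function for uniform gravity acting along -y."""
    def force(r, p):
        return [0.0, -m * g]
    return force

if __name__ == "__main__":
    m, g = 1.0, 9.81
    # Launch from the origin with velocity (3, 4) and follow the motion for 1 s.
    r, p = integrate_newton(gravity(m, g), m, [0.0, 0.0], [3.0 * m, 4.0 * m],
                            dt=1e-3, steps=1000)
    print("position:", r)
    print("momentum:", p)
```

Each step uses only the instantaneous force and the current state, which is the cause-and-effect structure described above. For a constant force the momentum is reproduced essentially exactly, while the position carries an error that shrinks in proportion to the time step.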


Analytical mechanics: Variational principles apply to many aspects of our daily life. Typical examples 
include selecting the optimum compromise in quality and cost when shopping, selecting the fastest route 
to travel from home to work, or selecting the optimum compromise to satisfy the disparate desires of the 
individuals comprising a family. Variational principles underlie the analytical formulation of mechanics. It 
is astonishing that the laws of nature are consistent with variational principles involving the principle of 
least action. Minimizing the action integral led to the development of the mathematical field of variational 
calculus, plus the analytical variational approaches to classical mechanics, by Euler, Lagrange, Hamilton, 
and Jacobi. 

Leibniz, who was a contemporary of Newton, introduced methods based on a quantity called “vis viva”, 
which is Latin for “living force” and equals twice the kinetic energy. Leibniz believed in the philosophy 
that God created a perfect world where nature would be thrifty in all its manifestations. In 1707, Leibniz 
proposed that the optimum path is based on minimizing the time integral of the vis viva, which is equiva- 
lent to the action integral of Lagrangian/Hamiltonian mechanics. In 1744 Euler derived the Leibniz result 
using variational concepts while Maupertuis restated the Leibniz result based on teleological arguments. 
The development of Lagrangian mechanics culminated in the 1788 publication of Lagrange’s monumental 
treatise entitled “Mécanique Analytique”. Lagrange used d'Alembert's Principle to derive Lagrangian
mechanics, providing a powerful analytical approach to determine the magnitude and direction of the optimum
trajectories, plus the associated forces. 

The culmination of the development of analytical mechanics occurred in 1834 when Hamilton proposed
his Principle of Least Action, as well as developing Hamiltonian mechanics which is the premier variational
approach in science. Hamilton's action is defined to be the time integral of the Lagrangian.
Hamilton's Action Principle (1834) minimizes the action integral $S$ defined by

$$S = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \qquad \text{(Hamilton's Principle)}$$
In the simplest form, the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ equals the difference between the
kinetic energy $T$ and the potential energy $U$. Hamilton’s Least Action Principle underlies Lagrangian
mechanics. This Lagrangian is a function of $n$ generalized coordinates $q_i$ plus their corresponding
velocities $\dot{q}_i$. Hamilton also developed the premier variational approach, called Hamiltonian
mechanics, that is based on the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ which is a function of the $n$
fundamental position $q_i$ plus the conjugate momentum $p_i$ variables. In 1843 Jacobi provided the
mathematical framework required to fully exploit the power of Hamiltonian mechanics. Note that the
Lagrangian, Hamiltonian, and the action integral all are scalar quantities, which simplifies derivation of
the equations of motion compared with the vector calculus used by Newtonian mechanics.

Figure 2 presents a philosophical roadmap illustrating the hierarchy of philosophical approaches based on
Hamilton’s Action Principle, that are available for deriving the equations of motion of a system. The
primary Stage 1 uses Hamilton’s Action functional, $S = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt$,
to derive the Lagrangian and Hamiltonian functionals, which provide the most fundamental and sophisticated
level of understanding. Stage 1 involves specifying all the active degrees of freedom, as well as the
interactions involved. Stage 2 uses the Lagrangian or Hamiltonian functionals, derived at Stage 1, in order
to derive the equations of motion for the system of interest. Stage 3 then uses these derived equations of
motion to solve for the motion of the system subject to a given set of initial boundary conditions. Note
that Lagrange first derived Lagrangian mechanics based on d’Alembert’s Principle, while Newton’s Laws of
Motion specify the equations of motion used in Newtonian mechanics.

Figure 2: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton's
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton's Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d'Alembert's
Principle. Newtonian mechanics can be derived directly based on Newton's Laws of Motion. The advantages
and power of Hamilton's Action Principle are unavailable if the Laws of Motion are derived using either
d’Alembert’s Principle or Newton's Laws of Motion.
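The statement that the physical trajectory minimizes the action integral can be illustrated numerically. The sketch below is a minimal example under stated assumptions, not a procedure from the book: a particle of mass m moves vertically in uniform gravity with $L = T - U = \frac{1}{2}m\dot{y}^2 - mgy$, the end points y(0) = y(T) = 0 are held fixed, and the trial paths are the Newtonian parabola plus a perturbation $\epsilon \sin(\pi t/T)$ that vanishes at both end points. The function names and the perturbation shape are hypothetical choices.

```python
import math

def action(path, t_total=1.0, m=1.0, g=9.81, steps=2000):
    """Approximate S = integral of (T - U) dt along y(t) = path(t).

    Each time slice uses a finite-difference velocity and the mean height
    of its two end points.
    """
    dt = t_total / steps
    s = 0.0
    for k in range(steps):
        y0 = path(k * dt)
        y1 = path((k + 1) * dt)
        velocity = (y1 - y0) / dt
        y_mean = 0.5 * (y0 + y1)
        lagrangian = 0.5 * m * velocity ** 2 - m * g * y_mean
        s += lagrangian * dt
    return s

def trial_path(eps, t_total=1.0, g=9.81):
    """Newtonian path y = g t (T - t) / 2 plus a perturbation of amplitude eps
    that leaves the end points y(0) = y(T) = 0 unchanged."""
    def y(t):
        return 0.5 * g * t * (t_total - t) + eps * math.sin(math.pi * t / t_total)
    return y

def scan_actions(eps_values=(-0.2, -0.1, 0.0, 0.1, 0.2)):
    """Evaluate the action for a family of trial paths labelled by eps."""
    return {eps: action(trial_path(eps)) for eps in eps_values}

if __name__ == "__main__":
    for eps, s in scan_actions().items():
        print(f"eps = {eps:+.1f}   S = {s:.5f}")
```

Among the trial paths scanned, the unperturbed path (eps = 0) should give the smallest action, and the increase for the other paths should grow as the square of eps, since the first-order variation vanishes on the physical trajectory.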


The analytical approach to classical mechanics appeared contradictory to Newton’s intuitive vectorial
treatment of force and momentum. There is a dramatic difference in philosophy between the
vector-differential equations of motion derived by Newtonian mechanics, which relate the instantaneous force to
the corresponding instantaneous acceleration, and analytical mechanics, where minimizing the scalar action 
integral involves integrals over space and time between specified initial and final states. Analytical mechanics 
uses variational principles to determine the optimum trajectory, from a continuum of tentative possibilities, 
by requiring that the optimum trajectory minimizes the action integral between specified initial and final 
conditions. 

Initially there was considerable prejudice and philosophical opposition to use of the variational principles 
approach which is based on the assumption that nature follows the principles of economy. The variational 
approach is not intuitive, and thus it was considered to be speculative and “metaphysical”, but it was 
tolerated as an efficient tool for exploiting classical mechanics. This opposition to the variational principles
underlying analytical mechanics delayed full appreciation of the variational approach until the start of the
20th century. As a consequence, the intuitive Newtonian formulation reigned supreme in classical mechanics
for over two centuries, even though the remarkable problem-solving capabilities of analytical mechanics were 
recognized and exploited following the development of analytical mechanics by Lagrange. 

The full significance and superiority of the analytical variational formulations of classical mechanics 
became well recognised and accepted following the development of the Special Theory of Relativity in 1905. 
The Theory of Relativity requires that the laws of nature be invariant to the reference frame. This is not 
satisfied by the Newtonian formulation of mechanics which assumes one absolute frame of reference and a 
separation of space and time. In contrast, the Lagrangian and Hamiltonian formulations of the principle of 
least action remain valid in the Theory of Relativity, if the Lagrangian is written in a relativistically-invariant
form in space-time. The complete invariance of the variational approach to coordinate frames is precisely
the formalism necessary for handling relativistic mechanics. 

Hamiltonian mechanics, which is expressed in terms of the conjugate variables (q, p), relates classical 
mechanics directly to the underlying physics of quantum mechanics and quantum field theory. As a conse- 
quence, the philosophical opposition to exploiting variational principles no longer exists, and Hamiltonian 
mechanics has become the preeminent formulation of modern physics. The reader is free to draw their own 
conclusions regarding the philosophical question “is the principle of economy a fundamental law of classical 
mechanics, or is it a fortuitous consequence of the fundamental laws of nature?” 

From the late seventeenth century, until the dawn of modern physics at the start of the twentieth cen- 
tury, classical mechanics remained a primary driving force in the development of physics. Classical mechanics 
embraces an unusually broad range of topics spanning motion of macroscopic astronomical bodies to mi- 
croscopic particles in nuclear and particle physics, at velocities ranging from zero to near the velocity of 
light, from one-body to statistical many-body systems, as well as having extensions to quantum mechanics. 
Introduction of the Special Theory of Relativity in 1905, and the General Theory of Relativity in 1916, 
necessitated modifications to classical mechanics for relativistic velocities, and relativity can be considered to be an
extended theory of classical mechanics. Since the 1920’s, quantal physics has superseded classical mechanics 
in the microscopic domain. Although quantum physics has played the leading role in the development of 
physics during much of the past century, classical mechanics still is a vibrant field of physics that recently 
has led to exciting developments associated with non-linear systems and chaos theory. This has spawned 
new branches of physics and mathematics as well as changing our notion of causality. 


Goals: The primary goal of this book is to introduce the reader to the powerful variational-principles 
approaches that play such a pivotal role in classical mechanics and many other branches of modern science 
and engineering. This book emphasizes the intellectual beauty of these remarkable developments, as well as 
stressing the philosophical implications that have had a tremendous impact on modern science. A secondary 
goal is to apply variational principles to solve advanced applications in classical mechanics in order to 
introduce many sophisticated and powerful mathematical techniques that underlie much of modern physics. 

This book starts with a review of Newtonian mechanics plus the solutions of the corresponding equations 
of motion. This is followed by an introduction to Lagrangian mechanics, based on d’Alembert’s Principle, 
in order to develop familiarity in applying variational principles to classical mechanics. This leads to intro- 
duction of the more fundamental Hamilton’s Action Principle, plus Hamiltonian mechanics, to illustrate the 
power provided by exploiting the full hierarchy of stages available for applying variational principles to classical
mechanics. Finally the book illustrates how variational principles in classical mechanics were exploited
during the development of both relativistic mechanics and quantum physics. The connections and applications
of classical mechanics to modern physics are emphasized throughout the book in an effort to span the
chasm that divides the Newtonian vector-differential formulation, and the integral variational formulation, of 
classical mechanics. This chasm is especially applicable to quantum mechanics which is based completely on 
variational principles. Note that variational principles, developed in the field of classical mechanics, now are 
used in a diverse and wide range of fields outside of physics, including economics, meteorology, engineering, 
and computing. 

This study of classical mechanics involves climbing a vast mountain of knowledge, and the pathway to the 
top leads to elegant and beautiful theories that underlie much of modern physics. This book exploits varia- 
tional principles applied to four major topics in classical mechanics to illustrate the power and importance of 
variational principles in physics. Being so close to the summit provides the opportunity to take a few extra 
steps beyond the normal introductory classical mechanics syllabus to glimpse the exciting physics found at 
the summit. This new physics includes topics such as quantum, relativistic, and statistical mechanics. 


Chapter 1 


A brief history of classical mechanics 


1.1 Introduction 


This chapter reviews the historical evolution of classical mechanics since considerable insight can be gained 
from study of the history of science. There are two dramatically different approaches used in classical 
mechanics. The first is the vectorial approach of Newton which is based on vector quantities like momentum, 
force, and acceleration. The second is the analytical approach of Lagrange, Euler, Hamilton, and Jacobi, 
that is based on the concept of least action and variational calculus. The more intuitive Newtonian picture 
reigned supreme in classical mechanics until the start of the twentieth century. Variational principles, which 
were developed during the nineteenth century, never aroused much enthusiasm in scientific circles due to 
philosophical objections to the underlying concepts; this approach was merely tolerated as an efficient tool 
for exploiting classical mechanics. A dramatic advance in the philosophy of science occurred at the start of
the 20th century leading to widespread acceptance of the superiority of using variational principles.


1.2 Greek antiquity 


The great philosophers in ancient Greece played a key role by using the astronomical work of the Babylonians 
to develop scientific theories of mechanics. Thales of Miletus (624 - 547 B.C.), the first of the seven
great Greek philosophers, developed geometry, and is hailed as the first true mathematician. Pythagoras
(570 - 495 B.C.) developed mathematics, and postulated that the earth is spherical. Democritus (460 -
370 B.C.) has been called the father of modern science, while Socrates (469 - 399 B.C.) is renowned for his
contributions to ethics. Plato (427-347 B.C.), who was a mathematician and student of Socrates, wrote
important philosophical dialogues. He founded the Academy in Athens which was the first institution of 
higher learning in the Western world that helped lay the foundations of Western philosophy and science. 
Aristotle (384-322 B.C.) is an important founder of Western philosophy encompassing ethics, logic, 
science, and politics. His views on the physical sciences profoundly influenced medieval scholarship that 
extended well into the Renaissance. He presented the first implied formulation of the principle of virtual 
work in statics, and his statement that “what is lost in velocity is gained in force” is a veiled reference to 
kinetic and potential energy. He adopted an Earth-centered model of the universe. Aristarchus (310 - 240
B.C.) argued that the Earth orbited the Sun and used measurements to infer the relative distances of the
Moon and the Sun. The Greek philosophers were relatively advanced in logic and mathematics and developed
concepts that enabled them to calculate areas and perimeters. Unfortunately their philosophical approach
neglected the collection of quantitative and systematic data, which is an essential ingredient in the advancement of
science.

Archimedes (287-212 B.C.) represented the culmination of science in ancient Greece. As an engineer 
he designed machines of war, while as a scientist he made significant contributions to hydrostatics and 
the principle of the lever. As a mathematician, he applied infinitesimals in a way that is reminiscent of
modern integral calculus, which he used to derive a value for π. Unfortunately much of the work of the
brilliant Archimedes subsequently fell into oblivion. Hero of Alexandria (10 - 70 A.D.) described the
principle of reflection that light takes the shortest path. This is an early illustration of the variational principle
of least time. Ptolemy (83 - 161 A.D.) wrote several scientific treatises that greatly influenced subsequent
philosophers. Unfortunately he adopted the incorrect geocentric solar system in contrast to the heliocentric 
model of Aristarchus and others. 


1.3 Middle Ages 


The decline and fall of the Roman Empire in ~410 A.D. marks the end of Classical Antiquity, and the 
beginning of the Dark Ages in Western Europe (Christendom), while the Muslim scholars in Eastern Europe 
continued to make progress in astronomy and mathematics. For example, in Egypt, Alhazen (965 - 1040 
A.D.) expanded the principle of least time to reflection and refraction. The Dark Ages involved a long 
scientific decline in Western Europe that languished for about 900 years. Science was dominated by religious 
dogma, all western scholars were monks, and the important scientific achievements of Greek antiquity were 
forgotten. The works of Aristotle were reintroduced to Western Europe by Arabs in the early 13th century
leading to the concepts of forces in static systems which were developed during the fourteenth century. 
This included concepts of the work done by a force, and the virtual work involved in virtual displacements. 
Leonardo da Vinci (1452-1519) was a leader in mechanics at that time. He made seminal contributions 
to science, in addition to his well known contributions to architecture, engineering, sculpture, and art. 

Nicolaus Copernicus (1473-1543) rejected the geocentric theory of Ptolemy and formulated a scientifically-based
heliocentric cosmology that displaced the Earth from the center of the universe. The Ptolemaic view
was that heaven represented the perfect unchanging divine while the earth represented change plus chaos,
and the celestial bodies moved relative to the fixed heavens. The book, De revolutionibus orbium coelestium 
(On the Revolutions of the Celestial Spheres), published by Copernicus in 1543, is regarded as the starting 
point of modern astronomy and the defining epiphany that began the Scientific Revolution. The book De 
Magnete written in 1600 by the English physician William Gilbert (1540-1603) presented the results of 
well-planned studies of magnetism and strongly influenced the intellectual-scientific evolution at that time. 

Johannes Kepler (1571-1630), a German mathematician, astronomer and astrologer, was a key 
figure in the 17th century Scientific Revolution. He is best known for recognizing the connection between the 
motions in the sky and physics. His laws of planetary motion were developed by later astronomers based on 
his written work Astronomia nova, Harmonices Mundi, and Epitome of Copernican Astronomy. Kepler
was an assistant to Tycho Brahe (1546-1601) who for many years recorded accurate astronomical data 
that played a key role in the development of Kepler’s theory of planetary motion. Kepler’s work provided 
the foundation for Isaac Newton’s theory of universal gravitation. Unfortunately Kepler did not recognize 
the true nature of the gravitational force. 

Galileo Galilei (1564-1642) built on Aristotle’s principle by recognizing the law of inertia, the
persistence of motion if no forces act, and the proportionality between force and acceleration. This amounts 
to recognition of work as the product of force times displacement in the direction of the force. He applied 
virtual work to the equilibrium of a body on an inclined plane. He also showed that the same principle 
applies to hydrostatic pressure that had been established by Archimedes, but he did not apply his concepts 
in classical mechanics to the considerable knowledge base on planetary motion. Galileo is famous for the 
apocryphal story that he dropped two cannon balls of different masses from the Tower of Pisa to demonstrate 
that their speed of descent was independent of their mass. 


1.4 Age of Enlightenment 


The Age of Enlightenment is a term used to describe a phase in Western philosophy and cultural life in 
which reason was advocated as the primary source and legitimacy for authority. It developed simultaneously 
in Germany, France, Britain, the Netherlands, and Italy around the 1650’s and lasted until the French 
Revolution in 1789. The intellectual and philosophical developments led to moral, social, and political 
reforms. The principles of individual rights, reason, common sense, and deism were a revolutionary departure 
from the existing theocracy, autocracy, oligarchy, aristocracy, and the divine right of kings. It led to political 
revolutions in France and the United States. It marks a dramatic departure from the Early Modern period 
which was noted for religious authority, absolute state power, guild-based economic systems, and censorship 
of ideas. It opened a new era of rational discourse, liberalism, freedom of expression, and scientific method. 
This new environment led to tremendous advances in both science and mathematics in addition to music,
literature, philosophy, and art. Scientific development during the 17th century included the pivotal advances
made by Newton and Leibniz at the beginning of the revolutionary Age of Enlightenment, culminating in the 
development of variational calculus and analytical mechanics by Euler and Lagrange. The scientific advances 
of this age include publication of two monumental books Philosophiae Naturalis Principia Mathematica by 
Newton in 1687 and Mécanique analytique by Lagrange in 1788. These are the definitive two books upon 
which classical mechanics is built. 


René Descartes (1596-1650) attempted to formulate the laws of motion in 1644. He talked about 
conservation of motion (momentum) in a straight line but did not recognize the vector character of momen- 
tum. Pierre de Fermat (1601-1665) and René Descartes were two leading mathematicians in the first 
half of the 17th century. Independently they discovered the principles of analytic geometry and developed
some initial concepts of calculus. Fermat and Blaise Pascal (1623-1662) were the founders of the theory 
of probability. 


Isaac Newton (1642-1727) made pioneering contributions to physics and mathematics as well as 
being a theologian. At 18 he was admitted to Trinity College Cambridge where he read the writings of 
modern philosophers like Descartes, and astronomers like Copernicus, Galileo, and Kepler. By 1665 he had 
discovered the generalized binomial theorem, and began developing infinitesimal calculus. Due to a plague,
the university closed for two years in 1665 during which Newton worked at home developing the theory 
of calculus that built upon the earlier work of Barrow and Descartes. He was elected Lucasian Professor 
of Mathematics in 1669 at the age of 26. From 1670 Newton focussed on optics leading to his Hypothesis 
of Light published in 1675 and his book Opticks in 1704. Newton described light as being made up of a 
flow of extremely subtle corpuscles that also had associated wavelike properties to explain diffraction and 
optical interference that he studied. Newton returned to mechanics in 1677 by studying planetary motion 
and gravitation that applied the calculus he had developed. In 1687 he published his monumental treatise 
entitled Philosophiae Naturalis Principia Mathematica which established his three universal laws of motion, 
the universal theory of gravitation, derivation of Kepler’s three laws of planetary motion, and was his first 
publication of the development of calculus which he called “the science of fluxions”. Newton’s laws of motion 
are based on the concepts of force and momentum, that is, force equals the rate of change of momentum. 
Newton’s postulate of an invisible force able to act over vast distances led him to be criticized for introducing 
“occult agencies” into science. In a remarkable achievement, Newton completely solved the laws of mechanics. 
His theory of classical mechanics and of gravitation reigned supreme until the development of the Theory 
of Relativity in 1905. The followers of Newton envisioned the Newtonian laws to be absolute and universal. 
This dogmatic reverence of Newtonian mechanics prevented physicists from an unprejudiced appreciation of 
the analytic variational approach to mechanics developed during the 17th through 19th centuries. Newton
was the first scientist to be knighted and was appointed president of the Royal Society. 


Gottfried Leibniz (1646-1716) was a brilliant German philosopher, a contemporary of Newton, who 
worked on both calculus and mechanics. Leibniz started development of calculus in 1675, ten years after 
Newton, but Leibniz published his work in 1684, which was three years before Newton’s Principia. Leibniz 
made significant contributions to integral calculus and developed the notation currently used in calculus. 
He introduced the name calculus based on the Latin word for the small stone used for counting. Newton 
and Leibniz were involved in a protracted argument over who originated calculus. It appears that Leibniz 
saw drafts of Newton’s work on calculus during a visit to England. Throughout their argument Newton 
was the ghost writer of most of the articles in support of himself and he had them published under the
nom de plume of his friends. Leibniz made the tactical error of appealing to the Royal Society to intercede on
his behalf. Newton, as president of the Royal Society, appointed his friends to an “impartial” committee to 
investigate this issue, then he wrote the committee’s report that accused Leibniz of plagiarism of Newton’s 
work on calculus, after which he had it published by the Royal Society. Still unsatisfied he then wrote an 
anonymous review of the report in the Royal Society’s own periodical. This bitter dispute lasted until the 
death of Leibniz. When Leibniz died his work was largely discredited. The fact that he falsely claimed to be 
a nobleman and added the prefix “von” to his name, coupled with Newton’s vitriolic attacks, did not help 
his credibility. Newton is reported to have declared that he took great satisfaction in “breaking Leibniz’s 
heart.” Studies during the 20th century have largely revived the reputation of Leibniz and he is recognized
to have made major contributions to the development of calculus.

Figure 1.1: Chronological roadmap of the parallel development of the Newtonian and Variational-principles
approaches to classical mechanics.


1.5 Variational methods in physics


Pierre de Fermat (1601-1665) revived the principle of least time, which states that light travels between 
two given points along the path of shortest time and was used to derive Snell’s law in 1657. This enunciation 
of variational principles in physics played a key role in the historical development of the variational principle 
of least action that underlies the analytical formulations of classical mechanics. 
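
The link between least time and Snell’s law can be checked numerically. The following is a minimal sketch added for illustration, not Fermat’s own derivation: it assumes a flat interface at y = 0, a source at (0, a), a receiver at (d, -b), and light speeds v1 and v2 in the two media. The function names `travel_time` and `golden_section_min` and all numerical values are hypothetical choices.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Travel time from (0, a) to (d, -b) when the ray crosses y = 0 at (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        e = lo + inv_phi * (hi - lo)
        if f(c) < f(e):
            hi = e   # the minimum lies in [lo, e]
        else:
            lo = c   # the minimum lies in [c, hi]
    return 0.5 * (lo + hi)

# Assumed geometry and speeds (arbitrary units); light is slower in medium 2.
a, b, d, v1, v2 = 1.0, 1.0, 2.0, 1.0, 0.75

# Crossing point that minimizes the total travel time.
x_min = golden_section_min(lambda x: travel_time(x, a, b, d, v1, v2), 0.0, d)

# Sines of the angles of incidence and refraction, measured from the normal.
sin1 = x_min / math.hypot(x_min, a)
sin2 = (d - x_min) / math.hypot(d - x_min, b)

# Snell's law in terms of speeds: sin(theta1)/v1 = sin(theta2)/v2.
snell_lhs = sin1 / v1
snell_rhs = sin2 / v2
print(snell_lhs, snell_rhs)
```

The two printed numbers should agree to within the accuracy of the search, which is the statement of Snell’s law for the least-time path under these assumptions.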

Gottfried Leibniz (1646-1716) made significant contributions to the development of variational prin- 
ciples in classical mechanics. In contrast to Newton’s laws of motion, which are based on the concept of 
momentum, Leibniz devised a new theory of dynamics based on kinetic and potential energy that anticipates 
the analytical variational approach of Lagrange and Hamilton. Leibniz argued for a quantity called the “vis 
viva”, which is Latin for living force, that equals twice the kinetic energy. Leibniz argued that the change 
in kinetic energy is equal to the work done. In 1687 Leibniz proposed that the optimum path is based on 
minimizing the time integral of the vis viva, which is equivalent to the action integral. Leibniz used both 
philosophical and causal arguments in his work which were acceptable during the Age of Enlightenment. Un- 
fortunately for Leibniz, his analytical approach based on energies, which are scalars, appeared contradictory 
to Newton’s intuitive vectorial treatment of force and momentum. There was considerable prejudice and 
philosophical opposition to the variational approach which assumes that nature is thrifty in all of its actions. 
The variational approach was considered to be speculative and “metaphysical” in contrast to the causal 
arguments supporting Newtonian mechanics. This opposition delayed full appreciation of the variational 
approach until the start of the 20th century.

Johann Bernoulli (1667-1748) was a Swiss mathematician who was a student of Leibniz’s calculus, and 
sided with Leibniz in the Newton-Leibniz dispute over the credit for developing calculus. Bernoulli also sided
with Descartes’ vortex theory of gravitation, which delayed acceptance of Newton’s theory of gravitation
in Europe. Bernoulli pioneered development of the calculus of variations by solving the problems of the 
catenary, the brachistochrone, and Fermat’s principle. Johann Bernoulli’s son Daniel played a significant 
role in the development of the well-known Bernoulli Principle in hydrodynamics. 

Pierre Louis Maupertuis (1698-1759) was a student of Johann Bernoulli and conceived the universal 
hypothesis that in nature there is a certain quantity called action which is minimized. Although this bold 
assumption correctly anticipates the development of the variational approach to classical mechanics, he 
obtained his hypothesis by an entirely incorrect method. He was a dilettante whose mathematical prowess 
was behind the high standards of that time, and he could not establish satisfactorily the quantity to be 
minimized. His teleological¹ argument was influenced by Fermat’s principle and the corpuscle theory of light
that implied a close connection between optics and mechanics. 

Leonhard Euler (1707-1783) was the preeminent Swiss mathematician of the 18th century and was
a student of Johann Bernoulli. Euler developed, with full mathematical rigor, the calculus of variations 
following in the footsteps of Johann Bernoulli. Euler used variational calculus to solve minimum/maximum 
isoperimetric problems that had attracted and challenged the early developers of calculus, Newton, Leibniz, 
and Bernoulli. Euler also was the first to solve the rigid-body rotation problem using the three components 
of the angular velocity as kinematical variables. Euler became blind in both eyes by 1766 but that did not 
hinder his prolific output in mathematics due to his remarkable memory and mental capabilities. Euler’s 
contributions to mathematics are remarkable in quality and quantity; for example during 1775 he published 
one mathematical paper per week in spite of being blind. Euler implied the principle of least
action using vis viva, which is not the exact form explicitly developed by Lagrange.

Jean le Rond d’Alembert (1717-1785) was a French mathematician and physicist who had the 
clever idea of extending use of the principle of virtual work from statics to dynamics. d’Alembert’s Principle 
rewrites the principle of virtual work in the form 

$$\sum_{i=1}^{N} (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0$$

where the inertial reaction force $\dot{\mathbf{p}}_i$ is subtracted from the corresponding force $\mathbf{F}_i$. This extension of the
principle of virtual work applies equally to both statics and dynamics leading to a single variational principle. 
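
As an illustrative aside (a sketch added here, not taken from d’Alembert’s own treatment), two limiting cases show how the single statement covers both statics and dynamics. The symbols follow the equation above. The second case assumes one unconstrained particle, so that its virtual displacement is arbitrary.

```latex
% Statics: every \dot{\mathbf{p}}_i = 0, which recovers the principle of virtual work.
\sum_{i=1}^{N} \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0

% Dynamics of one unconstrained particle (N = 1): \delta\mathbf{r} is arbitrary,
% so the bracket itself must vanish, which is Newton's second law.
(\mathbf{F} - \dot{\mathbf{p}}) \cdot \delta\mathbf{r} = 0
\quad\Longrightarrow\quad
\mathbf{F} = \dot{\mathbf{p}}
```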

¹ Teleology is any philosophical account that holds that final causes exist in nature, meaning that, analogous to purposes
found in human actions, nature inherently tends toward definite ends.

Joseph Louis Lagrange (1736-1813) was an Italian mathematician and a student of Leonhard Euler.
In 1788 Lagrange published his monumental treatise on analytical mechanics entitled Mécanique Analytique,
which introduces his Lagrangian mechanics analytical technique which is based on d’Alembert’s Principle of
Virtual Work. Lagrangian mechanics is a remarkably powerful technique that is equivalent to minimizing
the action integral S defined as

$$S = \int_{t_1}^{t_2} L\,dt$$


The Lagrangian L frequently is defined to be the difference between the kinetic energy T and potential 
energy V. His theory only required the analytical form of these scalar quantities. In the preface of his 
book he refers modestly to his extraordinary achievements with the statement “The reader will find no 
figures in the work. The methods which I set forth do not require either constructions or geometrical or 
mechanical reasonings: but only algebraic operations, subject to a regular and uniform rule of procedure.” 
Lagrange also introduced the concept of undetermined multipliers to handle auxiliary conditions, which
play a vital part in theoretical mechanics. William Hamilton, an outstanding figure in the analytical
formulation of classical mechanics, called Lagrange the “Shakespeare of mathematics,” on account of the 
extraordinary beauty, elegance, and depth of the Lagrangian methods. Lagrange also pioneered numerous 
significant contributions to mathematics. For example, Euler, Lagrange, and d’Alembert developed much of 
the mathematics of partial differential equations. Lagrange survived the French Revolution, and, in spite of 
being a foreigner, Napoleon named Lagrange to the Legion of Honour and made him a Count of the Empire 
in 1808. Lagrange was honoured by being buried in the Pantheon. 
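
The statement that the physical motion minimizes the action integral can be checked numerically in a simple case. The sketch below is an illustrative addition under stated assumptions: a particle of mass m moves vertically in uniform gravity g, the Lagrangian is L = T - V, and the endpoints y(0) = y(t_end) = 0 are held fixed. The function names `action` and `trial_path`, the sinusoidal perturbation, and the numerical values are hypothetical choices.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (T - V) * dt over the time steps,
    with T = m*v**2/2 and V = m*g*y evaluated at the midpoint of each step."""
    total = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        v = (y1 - y0) / dt
        y_mid = 0.5 * (y0 + y1)
        total += (0.5 * m * v * v - m * g * y_mid) * dt
    return total

def trial_path(eps, n=200, t_end=1.0, g=9.81):
    """Physical parabola y = g*t*(t_end - t)/2 plus a perturbation
    eps*sin(pi*t/t_end) that vanishes at both endpoints."""
    dt = t_end / n
    path = []
    for k in range(n + 1):
        t = k * dt
        y_true = 0.5 * g * t * (t_end - t)
        path.append(y_true + eps * math.sin(math.pi * t / t_end))
    return path, dt

# Compare the action of the physical path (eps = 0) with perturbed paths.
actions = {}
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    path, dt = trial_path(eps)
    actions[eps] = action(path, dt)
    print(f"eps = {eps:+.1f}  S = {actions[eps]:.6f}")
```

Under these assumptions the unperturbed path gives the smallest action of the five trial paths, and the action grows as the size of the perturbation increases.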

Carl Friedrich Gauss (1777-1855) was a German child prodigy who made many significant contri- 
butions to mathematics, astronomy and physics. He did not work directly on the variational approach, but 
Gauss’s law, the divergence theorem, and the Gaussian statistical distribution are important examples of 
concepts that he developed and which feature prominently in classical mechanics as well as other branches 
of physics, and mathematics. 

Siméon Poisson (1781-1840) was a brilliant mathematician who was a student of Lagrange. He
developed the Poisson statistical distribution as well as the Poisson equation that features prominently in 
electromagnetic and other field theories. His major contribution to classical mechanics is development, in 
1809, of the Poisson bracket formalism which featured prominently in development of Hamiltonian mechanics 
and quantum mechanics. 

The zenith in development of the variational approach to classical mechanics occurred during the 19th
century primarily due to the work of Hamilton and Jacobi. 

William Hamilton (1805-1865) was a brilliant Irish physicist, astronomer and mathematician who was 
appointed professor of astronomy at Dublin when he was barely 22 years old. He developed the Hamiltonian 
mechanics formalism of classical mechanics which now plays a pivotal role in modern classical and quantum 
mechanics. He opened an entirely new world beyond the developments of Lagrange. Whereas the Lagrange 
equations of motion are complicated second-order differential equations, Hamilton succeeded in transforming 
them into a set of first-order differential equations with twice as many variables that consider momenta and 
their conjugate positions as independent variables. The differential equations of Hamilton are linear, have 
separated derivatives, and represent the simplest and most desirable form possible for differential equations to 
be used in a variational approach. Hence the name “canonical variables” given by Jacobi. Hamilton exploited 
the d’Alembert principle to give the first exact formulation of the principle of least action which underlies the 
variational principles used in analytical mechanics. The form derived by Euler and Lagrange employed the 
principle in a way that applies only for conservative (scleronomic) cases. A significant discovery of Hamilton 
is his realization that classical mechanics and geometrical optics can be handled from one unified viewpoint. 
In both cases he uses a “characteristic” function that has the property that, by mere differentiation, the 
path of the body, or light ray, can be determined by the same partial differential equations. This solution is 
equivalent to the solution of the equations of motion. 
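
To make the first-order structure concrete, the following minimal sketch (an illustrative addition, not from the original text) integrates Hamilton’s equations for an assumed one-dimensional harmonic oscillator with H = p²/(2m) + kq²/2. The function names `hamilton_rhs`, `leapfrog`, and `energy`, the leapfrog integration scheme, and the parameter values are hypothetical choices.

```python
import math

def hamilton_rhs(q, p, m=1.0, k=1.0):
    """Hamilton's equations for H = p**2/(2*m) + k*q**2/2."""
    dq_dt = p / m     # dq/dt =  dH/dp
    dp_dt = -k * q    # dp/dt = -dH/dq
    return dq_dt, dp_dt

def leapfrog(q, p, dt, steps, m=1.0, k=1.0):
    """Advance (q, p) with the kick-drift-kick leapfrog scheme."""
    for _ in range(steps):
        p += 0.5 * dt * hamilton_rhs(q, p, m, k)[1]   # half step in momentum
        q += dt * hamilton_rhs(q, p, m, k)[0]         # full step in position
        p += 0.5 * dt * hamilton_rhs(q, p, m, k)[1]   # half step in momentum
    return q, p

def energy(q, p, m=1.0, k=1.0):
    """Value of the Hamiltonian, which is the total energy here."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

# Start at rest at q = 1 and integrate for roughly one period (2*pi for m = k = 1).
q0, p0 = 1.0, 0.0
dt, steps = 0.001, 6283
q1, p1 = leapfrog(q0, p0, dt, steps)

print("numerical q, p:", q1, p1)
print("analytic  q, p:", math.cos(dt * steps), -math.sin(dt * steps))
print("energy drift  :", energy(q1, p1) - energy(q0, p0))
```

Two first-order equations in the conjugate pair (q, p) replace the single second-order equation for q. For this particular Hamiltonian the numerical result can be compared directly with the analytic solution q = cos t, p = -sin t.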

Carl Gustav Jacob Jacobi (1804-1851), a Prussian mathematician and contemporary of Hamilton,
made significant developments in Hamiltonian mechanics. He immediately recognized the extraordinary im- 
portance of the Hamiltonian formulation of mechanics. Jacobi developed canonical transformation theory 
and showed that the function, used by Hamilton, is only one special case of functions that generate suit- 
able canonical transformations. He proved that any complete solution of the partial differential equation, 
without the specific boundary conditions applied by Hamilton, is sufficient for the complete integration of 
the equations of motion. This greatly extends the usefulness of Hamilton’s partial differential equations. 
In 1843 Jacobi developed both the Poisson brackets, and the Hamilton-Jacobi, formulations of Hamiltonian 
mechanics. The latter gives a single, first-order partial differential equation for the action function in terms
of the n generalized coordinates which greatly simplifies solution of the equations of motion. He also derived
a principle of least action for time-independent cases that had been studied by Euler and Lagrange.
Jacobi developed a superior approach to the variational integral that, by eliminating time from the integral, 
determined the path without saying anything about how the motion occurs in time. 

James Clerk Maxwell (1831-1879) was a Scottish theoretical physicist and mathematician. His most 
prominent achievement was formulating a classical electromagnetic theory that united previously unrelated 
observations, plus equations of electricity, magnetism and optics, into one consistent theory. Maxwell’s 
equations demonstrated that electricity, magnetism and light are all manifestations of the same phenomenon, 
namely the electromagnetic field. Consequently, all other classic laws and equations of electromagnetism 
were simplified cases of Maxwell’s equations. Maxwell’s achievements concerning electromagnetism have 
been called the “second great unification in physics”. Maxwell demonstrated that electric and magnetic 
fields travel through space in the form of waves at the constant speed of light. In 1864 Maxwell wrote “A 
Dynamical Theory of the Electromagnetic Field” which proposed that light was in fact undulations in the 
same medium that is the cause of electric and magnetic phenomena. His work in producing a unified model 
of electromagnetism is one of the greatest advances in physics. Maxwell, in collaboration with Ludwig 
Boltzmann (1844-1906), also helped develop the Maxwell-Boltzmann distribution, which is a statistical 
means of describing aspects of the kinetic theory of gases. These two discoveries helped usher in the era of 
modern physics, laying the foundation for such fields as special relativity and quantum mechanics. Boltzmann 
founded the field of statistical mechanics and was an early staunch advocate of the existence of atoms and 
molecules. 

Henri Poincaré (1854-1912) was a French theoretical physicist and mathematician. He was the first to 
present the Lorentz transformations in their modern symmetric form and discovered the remaining relativistic 
velocity transformations. Although there is similarity to Einstein’s Special Theory of Relativity, Poincaré and 
Lorentz still believed in the concept of the ether and did not fully comprehend the revolutionary philosophical 
change implied by Einstein. Poincaré worked on the solution of the three-body problem in planetary motion 
and was the first to discover a chaotic deterministic system which laid the foundations of modern chaos 
theory. It rejected the long-held deterministic view that if the position and velocities of all the particles are 
known at one time, then it is possible to predict the future for all time. 

The last two decades of the 19th century saw the culmination of classical physics and several important 
discoveries that led to a revolution in science that toppled classical physics from its throne. The end of the 
19th century was a time during which tremendous technological progress occurred; flight, the automobile, 
and turbine-powered ships were developed, Niagara Falls was harnessed for power, etc. During this period, 
Heinrich Hertz (1857-1894) produced electromagnetic waves confirming their derivation using Maxwell’s 
equations. Simultaneously he discovered the photoelectric effect which was crucial evidence in support of 
quantum physics. Technical developments, such as photography, the induction spark coil, and the vacuum 
pump played a significant role in scientific discoveries made during the 1890s. At the end of the 19th century, 
scientists thought that the basic laws were understood and worried that future physics would be in the fifth 
decimal place; some scientists worried that little was left for them to discover. However, there remained a 
few, presumed minor, unexplained discrepancies plus new discoveries that led to the revolution in science 
that occurred at the beginning of the 20th century. 


1.6 The 20th century revolution in physics 


The two greatest achievements of modern physics occurred at the beginning of the 20th century. The first 
was Einstein's development of the Theory of Relativity; the Special Theory of Relativity in 1905 and the 
General Theory of Relativity in 1915. This was followed in 1925 by the development of quantum mechanics. 

Albert Einstein (1879-1955) developed the Special Theory of Relativity in 1905 and the General The- 
ory of Relativity in 1915; both of these revolutionary theories had a profound impact on classical mechanics 
and the underlying philosophy of physics. The Newtonian formulation of mechanics was shown to be an 
approximation that applies only at low velocities, while the General Theory of Relativity superseded New- 
ton’s Law of Gravitation and explained the Equivalence Principle. The Newtonian concepts of an absolute 
frame of reference, plus the assumption of the separation of time and space, were shown to be invalid at 
relativistic velocities. Einstein’s postulate that the laws of physics are the same in all inertial frames requires 
a revolutionary change in the philosophy of time, space and reference frames which leads to a breakdown 
in the Newtonian formalism of classical mechanics. By contrast, the Lagrange and Hamiltonian variational 
formalisms of mechanics, plus the principle of least action, remain intact using a relativistically invariant 
Lagrangian. The independence of the variational approach from the choice of reference frame is precisely what is 
necessary for relativistic mechanics. The basic field equations also must remain invariant to the choice of 
coordinate frame in the General Theory of Relativity, which also can be derived in terms of a relativistic 
action principle. Thus the development of the Theory of Relativity unambiguously demonstrated the 
superiority of the variational formulation of classical mechanics over the vectorial Newtonian formulation, 
and thus the considerable effort made by Euler, Lagrange, Hamilton, Jacobi, and others in developing the 
analytical variational formalism of classical mechanics finally came to fruition at the start of the 20th century. 
Newton’s two crowning achievements, the Laws of Motion and the Laws of Gravitation, that had reigned 
supreme since published in the Principia in 1687, were toppled from the throne by Einstein. 

Emmy Noether (1882-1935) has been described as “the greatest ever woman mathematician”. In 
1915 she proposed a theorem that a conservation law is associated with any differentiable symmetry of a 
physical system. Noether’s theorem evolves naturally from Lagrangian and Hamiltonian mechanics and 
she applied it to the four-dimensional world of general relativity. Noether’s theorem has had an important 
impact in guiding the development of modern physics. 

Other profound developments that had revolutionary impacts on classical mechanics were quantum 
physics and quantum field theory. The 1913 model of atomic structure by Niels Bohr (1885-1962) and 
the subsequent enhancements by Arnold Sommerfeld (1868-1951), were based completely on classical 
Hamiltonian mechanics. The proposal of wave-particle duality by Louis de Broglie (1892-1987), made 
in his 1924 thesis, was the catalyst leading to the development of quantum mechanics. In 1925 Werner 
Heisenberg (1901-1976), and Max Born (1882-1970) developed a matrix representation of quantum 
mechanics using non-commuting conjugate position and momenta variables. 

Paul Dirac (1902-1984) showed in his Ph.D. thesis that Heisenberg’s matrix representation of quantum 
physics is based on the Poisson Bracket generalization of Hamiltonian mechanics, which, in contrast to 
Hamilton’s canonical equations, allows for non-commuting conjugate variables. In 1926 Erwin Schrödinger 
(1887-1961) independently introduced the operational viewpoint and reinterpreted the partial differential 
equation of Hamilton-Jacobi as a wave equation. His starting point was the optical-mechanical analogy of 
Hamilton that is a built-in feature of the Hamilton-Jacobi theory. Schrödinger then showed that the wave 
mechanics he developed, and the Heisenberg matrix mechanics, are equivalent representations of quantum 
mechanics. In 1928 Dirac developed his relativistic equation of motion for the electron and pioneered the 
field of quantum electrodynamics. Dirac also introduced the Lagrangian and the principle of least action to 
quantum mechanics, and these ideas were developed into the path-integral formulation of quantum mechanics 
and the theory of electrodynamics by Richard Feynman (1918-1988). 

The concepts of wave-particle duality, and quantization of observables, both are beyond the classical 
notions of infinite subdivisions in classical physics. In spite of the radical departure of quantum mechanics 
from earlier classical concepts, the basic feature of the differential equations of quantal physics is their self- 
adjoint character which means that they are derivable from a variational principle. Thus both the Theory of 
Relativity, and quantum physics are consistent with the variational principle of mechanics, and inconsistent 
with Newtonian mechanics. As a consequence Newtonian mechanics has been dislodged from the throne 
it occupied since 1687, and the intellectually beautiful and powerful variational principles of analytical 
mechanics have been validated. 

The 2015 observation of gravitational waves is a remarkable recent confirmation of Einstein’s General 
Theory of Relativity and the validity of the underlying variational principles in physics. Another advance in 
physics is the understanding of the evolution of chaos in non-linear systems that has been made during the 
past four decades. This advance is due to the availability of computers which has reopened this interesting 
branch of classical mechanics, that was pioneered by Henri Poincaré about a century ago. Although classical 
mechanics is the oldest and most mature branch of physics, there still remain new research opportunities in 
this field of physics. 

The focus of this book is to introduce the general principles of the mathematical variational principle 
approach, and its applications to classical mechanics. It will be shown that the variational principles, that 
were developed in classical mechanics, now play a crucial role in modern physics and mathematics, plus 
many other fields of science and technology. 

References: 

Excellent sources of information regarding the history of major players in the field of classical mechanics 
can be found on Wikipedia and in the book “The Variational Principles of Mechanics” by Lanczos.[La49] 


Chapter 2 


Review of Newtonian mechanics 


2.1 Introduction 


It is assumed that the reader has been introduced to Newtonian mechanics applied to one or two point objects. 
This chapter reviews Newtonian mechanics for motion of many-body systems as well as for macroscopic 
sized bodies. Newton's Law of Gravitation also is reviewed. The purpose of this review is to ensure that the 
reader has a solid foundation of elementary Newtonian mechanics upon which to build the powerful analytic 
Lagrangian and Hamiltonian approaches to classical dynamics. 

Newtonian mechanics is based on application of Newton's Laws of motion which assume that the concepts 
of distance, time, and mass, are absolute, that is, motion is in an inertial frame. The Newtonian idea of 
the complete separation of space and time, and the concept of the absoluteness of time, are violated by the 
Theory of Relativity as discussed in chapter 17. However, for most practical applications, relativistic effects 
are negligible and Newtonian mechanics is an adequate description at low velocities. Therefore chapters 
2 to 16 will assume velocities for which Newton's laws of motion are applicable. 


2.2 Newton's Laws of motion 


Newton defined a vector quantity called linear momentum $\mathbf{p}$ which is the product of mass and velocity. 

$$\mathbf{p} = m\dot{\mathbf{r}} \tag{2.1}$$

Since the mass $m$ is a scalar quantity, the velocity vector $\dot{\mathbf{r}}$ and the linear momentum vector $\mathbf{p}$ are 
colinear. 

Newton’s laws, expressed in terms of linear momentum, are: 

1 Law of inertia: A body remains at rest or in uniform motion unless acted upon by a force. 

2 Equation of motion: A body acted upon by a force moves in such a manner that the time rate of change 
of momentum equals the force. 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.2}$$

3 Action and reaction: If two bodies exert forces on each other, these forces are equal in magnitude and 
opposite in direction. 

Newton’s second law contains the essential physics relating the force F and the rate of change of linear 
momentum p. 


Newton’s first law, the law of inertia, is a special case of Newton’s second law in that if 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = 0 \tag{2.3}$$

then $\mathbf{p}$ is a constant of motion. 

Newton’s third law also can be interpreted as a statement of the conservation of momentum, that is, for 
a two particle system with no external forces acting, 

$$\mathbf{F}_{12} = -\mathbf{F}_{21} \tag{2.4}$$

If the forces acting on two bodies are their mutual action and reaction, then equation 2.4 simplifies to 

$$\mathbf{F}_{12} + \mathbf{F}_{21} = \frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = \frac{d}{dt}\left(\mathbf{p}_1 + \mathbf{p}_2\right) = 0 \tag{2.5}$$

This implies that the total linear momentum ($\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2$) is a constant of motion. 
Combining equations 2.1 and 2.2 leads to a second-order differential equation 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} = m\ddot{\mathbf{r}} \tag{2.6}$$

Note that the force on a body $\mathbf{F}$, and the resultant acceleration $\mathbf{a} = \mathbf{F}/m$, are colinear. Appendix C2 gives 
explicit expressions for the acceleration $\mathbf{a}$ in cartesian and curvilinear coordinate systems. The definition of 
force depends on the definition of the mass $m$. Newton’s laws of motion are obeyed to a high precision for 
velocities much less than the velocity of light. For example, recent experiments have shown they are obeyed 
with an error in the acceleration of $\Delta a < 5 \times 10^{-14}$ m/s$^2$. 


2.3 Inertial frames of reference 


An inertial frame of reference is one in which Newton's Laws of motion are valid. It is a non-accelerated 
frame of reference. An inertial frame must be homogeneous and isotropic. Physical experiments can be 
carried out in different inertial reference frames. The Galilean transformation provides a means of 
converting between two inertial frames of reference moving at a constant relative velocity. Consider two 
reference frames $O$ and $O'$ with $O'$ moving with constant velocity $\mathbf{V}$ at time $t$. Figure 2.1 shows a 
Galilean transformation which can be expressed in vector form. 

$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \qquad\qquad t' = t \tag{2.7}$$

Figure 2.1: Frame $O'$ moving with a constant velocity $\mathbf{V}$ with respect to frame $O$ at the time $t$. 

Equation 2.7 gives the boost, assuming Newton's hypothesis that the time is invariant to change of inertial 
frames of reference. The time differential of this transformation gives 

$$\dot{\mathbf{r}}' = \dot{\mathbf{r}} - \mathbf{V} \qquad\qquad \ddot{\mathbf{r}}' = \ddot{\mathbf{r}} \tag{2.8}$$

Note that the forces in the primed and unprimed inertial frames are related by 

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\ddot{\mathbf{r}} = m\ddot{\mathbf{r}}' = \mathbf{F}' \tag{2.9}$$

Thus Newton's Laws of motion are invariant under a Galilean transformation, that is, the inertial mass is 
unchanged under Galilean transformations. If Newton’s laws are valid in one inertial frame of reference, 
then they are valid in any frame of reference in uniform motion with respect to the first frame of reference. 
This invariance is called Galilean invariance. There are an infinite number of possible inertial frames all 
connected by Galilean transformations. 
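
The invariance of the acceleration under the boost of equation 2.7 can be checked numerically. The following is a minimal sketch, assuming an illustrative one-dimensional trajectory and boost velocity (neither is taken from the text); the helper names are hypothetical.

```python
# Sketch: a Galilean boost r' = r - V*t shifts the velocity by V
# but leaves the acceleration unchanged (equations 2.7 and 2.8).
# The trajectory and the boost velocity are illustrative assumptions.

def position(t):
    """1-D position (m) in frame O of a body with constant acceleration 2 m/s^2."""
    return 1.0 + 3.0 * t + 0.5 * 2.0 * t ** 2


def boosted_position(t, boost_velocity):
    """Position of the same body seen from frame O', using equation 2.7."""
    return position(t) - boost_velocity * t


def first_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of d^2f/dt^2."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2


V = 7.5   # assumed boost velocity of O' relative to O, m/s
t = 2.0   # time at which the two frames are compared, s


def primed(s):
    return boosted_position(s, V)


v_unprimed = first_derivative(position, t)
v_primed = first_derivative(primed, t)
a_unprimed = second_derivative(position, t)
a_primed = second_derivative(primed, t)

print("velocity in O, O':", v_unprimed, v_primed)
print("acceleration in O, O':", a_unprimed, a_primed)
```

The velocities differ by the boost velocity while the accelerations agree to within finite-difference rounding, so $m\ddot{\mathbf{r}}$, and hence the force, is the same in both frames.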

Galilean invariance violates Einstein’s Theory of Relativity. In order to satisfy Einstein’s postulate 
that the laws of physics are the same in all inertial frames, as well as satisfy Maxwell’s equations for 
electromagnetism, it is necessary to replace the Galilean transformation by the Lorentz transformation. As 
will be discussed in chapter 17, the Lorentz transformation leads to Lorentz contraction and time dilation both 
of which are related to the parameter $\gamma \equiv \frac{1}{\sqrt{1 - (v/c)^2}}$ where $c$ is the velocity of light in vacuum. Fortunately, 
most situations in life involve velocities where $v \ll c$; for example, for a body moving at 25,000 m.p.h. 
(11,111 m/s) which is the escape velocity for a body at the surface of the earth, the $\gamma$ factor differs from 
unity by about $6.8 \times 10^{-10}$ which is negligible. Relativistic effects are significant only in nuclear and particle 
physics as well as some exotic conditions in astrophysics. Thus, for the purpose of classical mechanics, 
usually it is reasonable to assume that the Galilean transformation is valid and is well obeyed under most 
practical conditions. 
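
The size of the relativistic correction quoted above can be estimated with a few lines of code. This is a sketch only: the function name is hypothetical, and the low-velocity approximation $\gamma - 1 \approx \frac{1}{2}(v/c)^2$ is a standard series expansion added here for comparison.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s (defined SI value)


def gamma_minus_one(v):
    """Return gamma - 1 for speed v (m/s), with gamma = 1/sqrt(1 - (v/c)^2)."""
    beta_squared = (v / C) ** 2
    return 1.0 / math.sqrt(1.0 - beta_squared) - 1.0


v_escape = 11_111.0                       # speed quoted in the text, m/s
exact = gamma_minus_one(v_escape)
approx = 0.5 * (v_escape / C) ** 2        # leading term of the expansion in (v/c)^2

print(f"gamma - 1 (direct)       = {exact:.3e}")
print(f"gamma - 1 (0.5*(v/c)^2)  = {approx:.3e}")
```

Both estimates are a little under $7 \times 10^{-10}$, consistent with the order of magnitude stated in the text.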

2.4 First-order integrals in Newtonian mechanics 


A fundamental goal of mechanics is to determine the equations of motion for an $n$-body system, where 
the force $\mathbf{F}_i$ acts on the individual mass $m_i$ where $1 \le i \le n$. Newton’s second-order equation of motion, 
equation 2.6, must be solved to calculate the instantaneous spatial locations, velocities, and accelerations for 
each mass $m_i$ of an $n$-body system. Both $\mathbf{F}_i$ and $\ddot{\mathbf{r}}_i$ are vectors, each having three orthogonal components. 
The solution of equation 2.6 involves integrating second-order equations of motion subject to a set of initial 
conditions. Although this task appears simple in principle, it can be exceedingly complicated for many-body 
systems. Fortunately, solution of the motion often can be simplified by exploiting three first-order integrals 
of Newton’s equations of motion, that are related directly to conservation of either the linear momentum, 
angular momentum, or energy of the system. In addition, for the special case of these three first-order 
integrals, the internal motion of any many-body system can be factored out by a simple transformation into 
the center of mass of the system. As a consequence, the following three first-order integrals are exploited 
extensively in classical mechanics. 


2.4.1 Linear Momentum 


Newton’s Laws can be written as the differential and integral forms of the first-order time integral which 
equals the change in linear momentum. That is 


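$$\mathbf{F}_i = \frac{d\mathbf{p}_i}{dt} \qquad\qquad \int_1^2 \mathbf{F}_i\,dt = \int_1^2 \frac{d\mathbf{p}_i}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1)_i \tag{2.10}$$

This allows Newton’s law of motion to be expressed directly in terms of the linear momentum $\mathbf{p}_i = m_i\dot{\mathbf{r}}_i$ of 
each of the $1 \le i \le n$ bodies in the system. This first-order time integral features prominently in classical 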
mechanics since it connects to the important concept of linear momentum p. This first-order time integral 
gives that the total linear momentum is a constant of motion when the sum of the external forces is zero. 
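
The integral form of equation 2.10 can be illustrated by integrating a time-dependent force numerically. The sketch below assumes a hypothetical two-dimensional force (a half-sine push plus gravity) chosen because its impulse is known in closed form; all numerical values are illustrative.

```python
import math

F0, TAU = 4.0, 0.5     # assumed push amplitude (N) and duration (s)
MASS, G = 0.2, 9.81    # assumed mass (kg) and gravitational acceleration (m/s^2)


def force(t):
    """Illustrative 2-D force (N): a horizontal half-sine push plus gravity."""
    return (F0 * math.sin(math.pi * t / TAU), -MASS * G)


def impulse(force_fn, t1, t2, steps=10_000):
    """Trapezoidal estimate of the time integral of a 2-D force from t1 to t2."""
    dt = (t2 - t1) / steps
    total = [0.0, 0.0]
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0   # end points count half
        fx, fy = force_fn(t1 + k * dt)
        total[0] += weight * fx * dt
        total[1] += weight * fy * dt
    return tuple(total)


# Left side of equation 2.10: the integral of F dt.
delta_p_numeric = impulse(force, 0.0, TAU)

# Right side: the momentum change obtained analytically for this force.
delta_p_exact = (2.0 * F0 * TAU / math.pi, -MASS * G * TAU)

print("numerical impulse :", delta_p_numeric)
print("analytic p2 - p1  :", delta_p_exact)
```

The two results agree to within the discretization error of the trapezoidal rule.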


2.4.2 Angular momentum 


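The angular momentum $\mathbf{L}_i$ of a particle $i$ with linear momentum $\mathbf{p}_i$ with respect to an origin from which 
the position vector $\mathbf{r}_i$ is measured, is defined by 

$$\mathbf{L}_i \equiv \mathbf{r}_i \times \mathbf{p}_i \tag{2.11}$$

The torque, or moment of the force $\mathbf{N}_i$ with respect to the same origin is defined to be 

$$\mathbf{N}_i \equiv \mathbf{r}_i \times \mathbf{F}_i \tag{2.12}$$

where $\mathbf{r}_i$ is the position vector from the origin to the point where the force $\mathbf{F}_i$ is applied. Note that the 
torque $\mathbf{N}_i$ can be written as 

$$\mathbf{N}_i = \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} \tag{2.13}$$

Consider the time differential of the angular momentum, $\frac{d\mathbf{L}_i}{dt}$ 

$$\frac{d\mathbf{L}_i}{dt} = \frac{d}{dt}\left(\mathbf{r}_i \times \mathbf{p}_i\right) = \frac{d\mathbf{r}_i}{dt} \times \mathbf{p}_i + \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} \tag{2.14}$$

However, 

$$\frac{d\mathbf{r}_i}{dt} \times \mathbf{p}_i = m_i\frac{d\mathbf{r}_i}{dt} \times \frac{d\mathbf{r}_i}{dt} = 0 \tag{2.15}$$

Equations 2.13 to 2.15 can be used to write the first-order time integral for angular momentum in either 
differential or integral form as 

$$\frac{d\mathbf{L}_i}{dt} = \mathbf{r}_i \times \frac{d\mathbf{p}_i}{dt} = \mathbf{N}_i \qquad\qquad \int_1^2 \mathbf{N}_i\,dt = \int_1^2 \frac{d\mathbf{L}_i}{dt}\,dt = (\mathbf{L}_2 - \mathbf{L}_1)_i \tag{2.16}$$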


Newton’s Law relates torque and angular momentum about the same axis. When the torque about any axis 
is zero then angular momentum about that axis is a constant of motion. If the total torque is zero then the 
total angular momentum, as well as the components about three orthogonal axes, all are constants. 
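
The relation between torque and the rate of change of angular momentum can be verified for a simple case. The sketch below assumes a projectile in a uniform gravitational field with illustrative initial conditions, and compares a finite-difference estimate of $d\mathbf{L}/dt$ with $\mathbf{r} \times \mathbf{F}$ about the same origin.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


MASS = 0.5                      # assumed mass, kg
G_VEC = (0.0, 0.0, -9.81)       # uniform gravitational acceleration, m/s^2
R0 = (1.0, 2.0, 0.0)            # assumed initial position, m
V0 = (3.0, -1.0, 4.0)           # assumed initial velocity, m/s


def state(t):
    """Position and velocity of the projectile at time t."""
    r = tuple(R0[k] + V0[k] * t + 0.5 * G_VEC[k] * t * t for k in range(3))
    v = tuple(V0[k] + G_VEC[k] * t for k in range(3))
    return r, v


def angular_momentum(t):
    """L = r x p about the origin."""
    r, v = state(t)
    return cross(r, tuple(MASS * c for c in v))


def torque(t):
    """N = r x F about the origin, with F = m g."""
    r, _ = state(t)
    return cross(r, tuple(MASS * c for c in G_VEC))


t, h = 0.7, 1e-5
l_plus, l_minus = angular_momentum(t + h), angular_momentum(t - h)
dL_dt = tuple((l_plus[k] - l_minus[k]) / (2.0 * h) for k in range(3))
N = torque(t)

print("dL/dt :", dL_dt)
print("N     :", N)
```

Gravity has no horizontal component here, so the torque about the vertical axis vanishes and the vertical component of $\mathbf{L}$ is constant, in line with the statement above about components along an axis.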

2.4.3 Kinetic energy 


The third first-order integral, that can be used for solving the equations of motion, is the first-order spatial 
integral $\int \mathbf{F}_i \cdot d\mathbf{r}_i$. Note that this spatial integral is a scalar in contrast to the first-order time integrals for 
linear and angular momenta which are vectors. The work done on a mass $m_i$ by a force $\mathbf{F}_i$ in transforming 
from condition 1 to 2 is defined to be 

$$[W_{12}]_i \equiv \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i \tag{2.17}$$

If $\mathbf{F}_i$ is the net resultant force acting on a particle $i$, then the integrand can be written as 

$$\mathbf{F}_i \cdot d\mathbf{r}_i = \frac{d\mathbf{p}_i}{dt} \cdot \frac{d\mathbf{r}_i}{dt}\,dt = m_i\frac{d\mathbf{v}_i}{dt} \cdot \mathbf{v}_i\,dt = \frac{m_i}{2}\frac{d}{dt}\left(\mathbf{v}_i \cdot \mathbf{v}_i\right)dt = d\left(\frac{1}{2}m_iv_i^2\right) = d[T]_i \tag{2.18}$$

where the kinetic energy of a particle $i$ is defined as 

$$[T]_i \equiv \frac{1}{2}m_iv_i^2 \tag{2.19}$$

Thus the work done on the particle $i$, that is, $[W_{12}]_i$ equals the change in kinetic energy of the particle if 
there is no change in other contributions to the total energy such as potential energy, heat dissipation, etc. 
That is 

$$[W_{12}]_i = \int_1^2 d\left(\frac{1}{2}m_iv_i^2\right) = [T_2 - T_1]_i \tag{2.20}$$

Thus the differential, and corresponding first integral, forms of the kinetic energy can be written as 

$$\mathbf{F}_i = \frac{dT_i}{d\mathbf{r}_i} \qquad\qquad \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i = (T_2 - T_1)_i \tag{2.21}$$

If the work done on the particle is positive, then the final kinetic energy $T_2 > T_1$. Especially noteworthy is that 
the kinetic energy $[T]_i$ is a scalar quantity which makes it simple to use. This first-order spatial integral is the 
foundation of the analytic formulation of mechanics that underlies Lagrangian and Hamiltonian mechanics. 
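
The equality of work done and change in kinetic energy can be checked numerically. The sketch below assumes a mass on an ideal spring moving in one dimension, with illustrative parameter values, and evaluates $\int \mathbf{F} \cdot d\mathbf{r}$ as $\int F v\,dt$ along the known motion.

```python
import math

MASS, K_SPRING, AMPLITUDE = 0.3, 12.0, 0.1   # assumed kg, N/m, m
OMEGA = math.sqrt(K_SPRING / MASS)           # angular frequency of the oscillation


def x(t):
    """Displacement of the mass, released from rest at x = AMPLITUDE."""
    return AMPLITUDE * math.cos(OMEGA * t)


def v(t):
    """Velocity of the mass."""
    return -AMPLITUDE * OMEGA * math.sin(OMEGA * t)


def force(t):
    """Net force on the mass: the spring restoring force -k x."""
    return -K_SPRING * x(t)


def work(t1, t2, steps=20_000):
    """Trapezoidal estimate of the work integral, using dr = v dt."""
    dt = (t2 - t1) / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        t = t1 + k * dt
        total += weight * force(t) * v(t) * dt
    return total


def kinetic_energy(t):
    return 0.5 * MASS * v(t) ** 2


t1, t2 = 0.0, 0.2
W12 = work(t1, t2)
delta_T = kinetic_energy(t2) - kinetic_energy(t1)

print("work done W12      :", W12)
print("change in T, T2-T1 :", delta_T)
```

Because the spring force is the net force in this sketch, the computed work equals the gain in kinetic energy, and it is positive over the chosen interval since the mass is accelerating toward equilibrium.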


2.5 Conservation laws in classical mechanics 


Elucidating the dynamics in classical mechanics is greatly simplified when conservation laws are applicable. 
In nature, isolated many-body systems frequently conserve one or more of the first-order integrals for linear 
momentum, angular momentum, and mass/energy. Note that mass and energy are coupled in the Theory 
of Relativity, but for non-relativistic mechanics the conservation of mass and energy are decoupled. Other 
observables such as lepton and baryon numbers are conserved, but these conservation laws usually can be 
subsumed under conservation of mass for most problems in non-relativistic classical mechanics. The power 
of conservation laws in calculating classical dynamics makes it useful to combine the conservation laws 
with the first integrals for linear momentum, angular momentum, and work-energy, when solving problems 
involving Newtonian mechanics. These three conservation laws will be derived assuming Newton’s laws of 
motion; however, these conservation laws are fundamental laws of nature that apply well beyond the domain 
of applicability of Newtonian mechanics. 


2.6 Motion of finite-sized and many-body systems 


Elementary presentations in classical mechanics discuss motion and forces involving single point particles. 
However, in real life, single bodies have a finite size introducing new degrees of freedom such as rotation and 
vibration, and frequently many finite-sized bodies are involved. A finite-sized body can be thought of as a 
system of interacting particles such as the individual atoms of the body. The interactions between the parts 
of the body can be strong which leads to rigid body motion where the positions of the particles are held 
fixed with respect to each other, and the body can translate and rotate. When the interaction between the 
bodies is weaker, such as for a diatomic molecule, additional vibrational degrees of relative motion between 
the individual atoms are important. Newton’s third law of motion becomes especially important for such 
many-body systems. 

2.7 Center of mass of a many-body system 


A finite sized body needs a reference point with respect to which the motion can be described. For example, 
there are 8 corners of a cube that could serve as reference points, but the motion of each corner is 
complicated if the cube is both translating and rotating. The treatment of the behavior of finite-sized 
bodies, or many-body systems, is greatly simplified using the concept of center of mass. The center of mass 
is a particular fixed point in the body that has an especially valuable property; that is, the translational 
motion of a finite sized body can be treated like that of a point mass located at the center of mass. In 
addition the translational motion is separable from the rotational-vibrational motion of a many-body system 
when the motion is described with respect to the center of mass. Thus it is convenient at this juncture to 
introduce the concept of center of mass of a many-body system. 

For a many-body system, the position vector $\mathbf{r}_i$, defined relative to the laboratory system, is related to the 
position vector $\mathbf{r}'_i$ with respect to the center of mass, and the center-of-mass location $\mathbf{R}$ relative to the 
laboratory system. That is, as shown in figure 2.2 

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.22}$$

Figure 2.2: Position vector with respect to the center of mass. 


This vector relation defines the transformation between the laboratory and center of mass systems. For 
discrete and continuous systems respectively, the location of the center of mass is uniquely defined as being 
where 


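$$\sum_i^n m_i\mathbf{r}'_i = \int \mathbf{r}'\rho\,dV = 0 \tag{Center of mass definition}$$

Define the total mass $M$ as 

$$M = \sum_i^n m_i = \int_{body} \rho\,dV \tag{Total mass}$$

The average location of the system corresponds to the location of the center of mass since $\frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = 0$, 
that is 

$$\frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \mathbf{R} + \frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = \mathbf{R} \tag{2.23}$$

The vector $\mathbf{R}$, which describes the location of the center of mass, depends on the origin and coordinate 
system chosen. For a continuous mass distribution the location vector of the center of mass is given by 

$$\mathbf{R} = \frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \frac{1}{M}\int \mathbf{r}\rho\,dV \tag{2.24}$$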


The center of mass can be evaluated by calculating the individual components along three orthogonal axes. 

The center-of-mass frame of reference is defined as the frame for which the center of mass is stationary. 
This frame of reference is especially valuable for elucidating the underlying physics which involves only the 
relative motion of the many bodies. That is, the trivial translational motion of the center of mass frame, 
which has no influence on the relative motion of the bodies, is factored out and can be ignored. For example, 
a tennis ball (0.06 kg) approaching the earth ($6 \times 10^{24}$ kg) with velocity $v$ could be treated in three frames, 
(a) assume the earth is stationary, (b) assume the tennis ball is stationary, or (c) the center-of-mass frame. 
The latter frame ignores the center of mass motion which has no influence on the relative motion of the 
tennis ball and the earth. The center of linear momentum and center of mass coordinate frames are identical 
in Newtonian mechanics but not in relativistic mechanics as described in chapter 17.4.3. 
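
For a discrete system the center-of-mass definitions above reduce to weighted sums. The following minimal sketch assumes three illustrative point masses; the function name `center_of_mass` is hypothetical.

```python
def center_of_mass(masses, positions):
    """Return total mass M and center-of-mass vector R = (1/M) * sum(m_i * r_i)."""
    total_mass = sum(masses)
    dim = len(positions[0])
    R = tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / total_mass
        for k in range(dim)
    )
    return total_mass, R


# Illustrative system (assumed values): masses in kg, laboratory positions in m.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]

M, R = center_of_mass(masses, positions)

# Positions relative to the center of mass: r'_i = r_i - R.
primed = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

# The mass-weighted sum of the primed positions should vanish by definition.
first_moment = tuple(
    sum(m * rp[k] for m, rp in zip(masses, primed)) for k in range(3)
)

print("M =", M)
print("R =", R)
print("sum of m_i r'_i =", first_moment)
```

The vanishing of the mass-weighted sum of the primed positions is the defining property of the center of mass used throughout the following sections.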

2.8 Total linear momentum of a many-body system 


2.8.1 Center-of-mass decomposition 


The total linear momentum P for a system of n particles is given by 


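$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i \tag{2.25}$$

It is convenient to describe a many-body system by a position vector $\mathbf{r}'_i$ with respect to the center of mass. 

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.26}$$

That is, 

$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i = \frac{d}{dt}M\mathbf{R} + \frac{d}{dt}\sum_i^n m_i\mathbf{r}'_i = \frac{d}{dt}M\mathbf{R} + 0 = M\dot{\mathbf{R}} \tag{2.27}$$

since $\sum_i m_i\mathbf{r}'_i = 0$ as given by the definition of the center of mass. That is, 

$$\mathbf{P} = M\dot{\mathbf{R}} \tag{2.28}$$

Thus the total linear momentum for a system is the same as the momentum of a single particle of mass 
$M = \sum_i m_i$ located at the center of mass of the system. 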


2.8.2 Equations of motion 


The force acting on particle $i$, in an $n$-particle many-body system, can be separated into an external force 
$\mathbf{F}_i^{E}$ plus internal forces $\mathbf{f}_{ij}$ between the $n$ particles of the system 

$$\mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.29}$$

The origin of the external force is from outside of the system while the internal force is due to the mutual 
interaction between the $n$ particles in the system. Newton’s Law tells us that 

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.30}$$

Thus the rate of change of total momentum is 

$$\dot{\mathbf{P}} = \sum_i^n \dot{\mathbf{p}}_i = \sum_i^n \mathbf{F}_i^{E} + \sum_i^n \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.31}$$

Note that since the indices are dummy then 

$$\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = \sum_j \sum_{i \neq j} \mathbf{f}_{ji} \tag{2.32}$$

Substituting Newton’s third law $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ into equation 2.32 implies that 

$$\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = \sum_j \sum_{i \neq j} \mathbf{f}_{ji} = -\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = 0 \tag{2.33}$$

which is satisfied only for the case where the summations equal zero. That is, for every internal force, there 
is an equal and opposite reaction force that cancels that internal force. 
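
The cancellation of the internal forces can be seen directly by summing pairwise forces over all ordered pairs. The sketch below assumes an inverse-square attraction between four illustrative point masses, with the gravitational constant set to 1; any force law obeying Newton's third law would give the same result.

```python
import itertools


def pair_force(r_i, r_j, m_i, m_j):
    """Force on particle i due to particle j for an inverse-square attraction.

    The coupling constant is set to 1; this force law is an illustrative assumption.
    """
    d = tuple(r_j[k] - r_i[k] for k in range(3))
    dist = sum(c * c for c in d) ** 0.5
    return tuple(m_i * m_j * c / dist ** 3 for c in d)


# Illustrative system (assumed values).
masses = [1.0, 2.5, 0.7, 4.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (-0.3, 2.0, 1.0), (2.0, -1.0, 0.5)]

# Sum f_ij over all ordered pairs with i != j, as in equation 2.33.
net_internal = [0.0, 0.0, 0.0]
for i, j in itertools.permutations(range(len(masses)), 2):
    f_ij = pair_force(positions[i], positions[j], masses[i], masses[j])
    for k in range(3):
        net_internal[k] += f_ij[k]

print("sum of internal forces:", net_internal)
```

Each term $\mathbf{f}_{ij}$ is cancelled by its partner $\mathbf{f}_{ji}$, so the net internal force vanishes to within rounding and only external forces can change the total momentum.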

Therefore the first-order integral for linear momentum can be written in differential and integral forms 
as 


$$\dot{\mathbf{P}} = \sum_i^n \mathbf{F}_i^{E} \qquad\qquad \int_1^2 \sum_i^n \mathbf{F}_i^{E}\,dt = \mathbf{P}_2 - \mathbf{P}_1 \tag{2.34}$$

The reaction of a body to an external force is equivalent to a single particle of mass $M$ located at the center 
of mass assuming that the internal forces cancel due to Newton’s third law. 

Note that the total linear momentum $\mathbf{P}$ is conserved if the net external force $\mathbf{F}^{E}$ is zero, that is 

$$\mathbf{F}^{E} = \frac{d\mathbf{P}}{dt} = 0 \tag{2.35}$$

Therefore the $\mathbf{P}$ of the center of mass is a constant. Moreover, if the component of the force along any 
direction $\hat{\mathbf{e}}$ is zero, that is, 

$$\mathbf{F}^{E} \cdot \hat{\mathbf{e}} = \frac{d\mathbf{P} \cdot \hat{\mathbf{e}}}{dt} = 0 \tag{2.36}$$

then $\mathbf{P} \cdot \hat{\mathbf{e}}$ is a constant. This fact is used frequently to solve problems involving motion in a constant force 
field. For example, in the earth’s gravitational field, the momentum of an object moving in vacuum in the 
vertical direction is time dependent because of the gravitational force, whereas the horizontal component of 
momentum is constant if no forces act in the horizontal direction. 


2.1 Example: Exploding cannon shell 


Consider a cannon shell of mass $M$ that moves along a parabolic trajectory in the earth's gravitational field. 
An internal explosion, generating an amount $E$ of mechanical energy, blows the shell into two parts. One 
part of mass $kM$, where $k < 1$, continues moving along the same trajectory with velocity $v'$ while the other 
part is reduced to rest. Find the velocity of the mass $kM$ immediately after the explosion. 

It is important to remember that the energy release $E$ is given in the center of mass. If the velocity of the 
shell immediately before the explosion is $v$ and $v'$ is the velocity of the $kM$ part immediately after the 
explosion, then energy conservation gives that $\frac{1}{2}Mv^2 + E = \frac{1}{2}kMv'^2$. The conservation of linear momentum 
gives $Mv = kMv'$. Eliminating $v$ from these equations gives 

$$v' = \sqrt{\frac{2E}{kM(1 - k)}}$$

[Figure: Exploding cannon shell] 
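
The result of this example can be checked against the two conservation laws. The sketch below assumes the closed-form solution $v' = \sqrt{2E/[kM(1-k)]}$, which follows from eliminating $v$ between the energy and momentum equations, and uses illustrative values for $M$, $k$ and $E$.

```python
import math


def fragment_speed(energy, shell_mass, k):
    """Speed v' of the fragment of mass k*M just after the explosion.

    Obtained by eliminating v between
        0.5*M*v**2 + E = 0.5*k*M*v'**2   and   M*v = k*M*v'.
    """
    return math.sqrt(2.0 * energy / (k * shell_mass * (1.0 - k)))


# Illustrative values (assumed): 10 kg shell, 40% of the mass keeps moving, 3 kJ released.
M, k, E = 10.0, 0.4, 3000.0

v_after = fragment_speed(E, M, k)
v_before = k * v_after                        # from momentum conservation M v = k M v'

energy_before = 0.5 * M * v_before ** 2 + E   # kinetic energy plus released energy
energy_after = 0.5 * k * M * v_after ** 2     # only the moving fragment carries kinetic energy

print("v' =", v_after, "m/s;  v =", v_before, "m/s")
print("energy before, after:", energy_before, energy_after)
```

For these values the fragment speed is 50 m/s and the shell speed before the explosion is 20 m/s, and the energy balance closes at 5000 J on both sides.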


2.2 Example: Billiard-ball collisions 


A billiard ball with mass $m$ and incident velocity $v$ collides with an identical stationary ball. Assume that 
the balls bounce off each other elastically in such a way that the incident ball is deflected at a scattering angle 
$\theta$ to the incident direction. Calculate the final velocities $v_f$ and $V_f$ of the two balls and the scattering angle 
$\phi$ of the target ball. The conservation of linear momentum in the incident direction $x$, and the perpendicular 
direction give 

$$mv = mv_f\cos\theta + mV_f\cos\phi \qquad\qquad 0 = mv_f\sin\theta - mV_f\sin\phi$$

Energy conservation gives 

$$\frac{m}{2}v^2 = \frac{m}{2}v_f^2 + \frac{m}{2}V_f^2$$

Solving these three equations gives $\phi = 90^\circ - \theta$, that is, the balls bounce off perpendicular to each other in 
the laboratory frame. The final velocities are 

$$v_f = v\cos\theta \qquad\qquad V_f = v\sin\theta$$
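
The stated solution can be substituted back into the three conservation equations as a check. This is a minimal sketch with an assumed incident speed and scattering angle; the function name `billiard_outcome` is hypothetical.

```python
import math


def billiard_outcome(v, theta_deg):
    """Final speeds and target-ball angle for equal-mass elastic scattering.

    Uses the solution quoted in the example: phi = 90 deg - theta,
    v_f = v cos(theta), V_f = v sin(theta).
    """
    theta = math.radians(theta_deg)
    phi = math.pi / 2.0 - theta
    return v * math.cos(theta), v * math.sin(theta), phi


v, theta_deg = 2.0, 30.0                   # assumed incident speed (m/s) and angle (deg)
theta = math.radians(theta_deg)
v_f, V_f, phi = billiard_outcome(v, theta_deg)

# Residuals of the conservation equations (the common mass m cancels).
px_residual = v - (v_f * math.cos(theta) + V_f * math.cos(phi))
py_residual = v_f * math.sin(theta) - V_f * math.sin(phi)
energy_residual = v ** 2 - (v_f ** 2 + V_f ** 2)

print("v_f, V_f, phi (deg):", v_f, V_f, math.degrees(phi))
print("residuals:", px_residual, py_residual, energy_residual)
```

All three residuals vanish to rounding error for any scattering angle between 0 and 90 degrees, confirming that the two balls leave at right angles.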

2.9 Angular momentum of a many-body system 


2.9.1 Center-of-mass decomposition 


As was the case for linear momentum, for a many-body system it is possible to separate the angular mo- 
mentum into two components. One component is the angular momentum about the center of mass and the 
other component is the angular motion of the center of mass about the origin of the coordinate system. This 
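separation is done by describing the angular momentum of a many-body system using a position vector $\mathbf{r}'_i$ 
with respect to the center of mass plus the vector location $\mathbf{R}$ of the center of mass. 

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.37}$$

The total angular momentum 

$$\begin{aligned}
\mathbf{L} &= \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i \\
&= \sum_i^n \left(\mathbf{R} + \mathbf{r}'_i\right) \times m_i\left(\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i\right) \\
&= \sum_i^n m_i\left[\mathbf{r}'_i \times \dot{\mathbf{r}}'_i + \mathbf{r}'_i \times \dot{\mathbf{R}} + \mathbf{R} \times \dot{\mathbf{r}}'_i + \mathbf{R} \times \dot{\mathbf{R}}\right]
\end{aligned} \tag{2.38}$$

Note that if the position vectors are with respect to the center of mass, then $\sum_i m_i\mathbf{r}'_i = 0$ resulting in the 
middle two terms in the bracket being zero, that is, 

$$\mathbf{L} = \sum_i^n \mathbf{r}'_i \times \mathbf{p}'_i + \mathbf{R} \times \mathbf{P} \tag{2.39}$$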


The total angular momentum separates into two terms, the angular momentum about the center of mass, 
plus the angular momentum of the center of mass about the origin of the axis system. This factoring of the 
angular momentum only applies for the center of mass. This is called Samuel König’s first theorem. 
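
The decomposition of the total angular momentum into a part about the center of mass and a part due to the motion of the center of mass can be checked numerically. The sketch below assumes three point masses with illustrative positions and velocities.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def weighted_mean(masses, vectors):
    """Mass-weighted mean of a list of 3-vectors."""
    total = sum(masses)
    return tuple(sum(m * u[k] for m, u in zip(masses, vectors)) / total for k in range(3))


# Illustrative system (assumed values): kg, m, m/s.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-1.0, 1.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (-2.0, 0.0, 1.0), (0.5, 0.5, -1.0)]

M = sum(masses)
R = weighted_mean(masses, positions)        # center-of-mass position
R_dot = weighted_mean(masses, velocities)   # center-of-mass velocity
P = tuple(M * c for c in R_dot)             # total momentum, P = M * R_dot

# Left side: total angular momentum about the laboratory origin.
L_total = [0.0, 0.0, 0.0]
# First term on the right: angular momentum about the center of mass.
L_about_cm = [0.0, 0.0, 0.0]
for m, r, v in zip(masses, positions, velocities):
    l_lab = cross(r, tuple(m * c for c in v))
    r_prime = tuple(r[k] - R[k] for k in range(3))
    p_prime = tuple(m * (v[k] - R_dot[k]) for k in range(3))
    l_cm = cross(r_prime, p_prime)
    for k in range(3):
        L_total[k] += l_lab[k]
        L_about_cm[k] += l_cm[k]

# Second term on the right: angular momentum of the center of mass, R x P.
L_of_cm = cross(R, P)

print("L total        :", L_total)
print("L_cm + R x P   :", [L_about_cm[k] + L_of_cm[k] for k in range(3)])
```

The two printed vectors agree; repeating the calculation with a reference point other than the center of mass would leave the cross terms of equation 2.38 nonzero.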

2.9.2 Equations of motion 


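The time derivative of the angular momentum 

$$\dot{\mathbf{L}}_i = \frac{d}{dt}\left(\mathbf{r}_i \times \mathbf{p}_i\right) = \dot{\mathbf{r}}_i \times \mathbf{p}_i + \mathbf{r}_i \times \dot{\mathbf{p}}_i \tag{2.40}$$

But 

$$\dot{\mathbf{r}}_i \times \mathbf{p}_i = m_i\dot{\mathbf{r}}_i \times \dot{\mathbf{r}}_i = 0 \tag{2.41}$$

Thus the torque $\mathbf{N}_i$ acting on mass $i$ is given by 

$$\mathbf{N}_i = \dot{\mathbf{L}}_i = \mathbf{r}_i \times \dot{\mathbf{p}}_i = \mathbf{r}_i \times \mathbf{F}_i \tag{2.42}$$

Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an 
external force $\mathbf{F}_i^{E}$ plus internal forces between the $n$ particles of the system 

$$\mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.43}$$

The origin of the external force is from outside of the system while the internal force is due to the interaction 
with the other $n - 1$ particles in the system. Newton's Law tells us that 

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.44}$$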



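The rate of change of total angular momentum is

$$\dot{\mathbf{L}} = \sum_i \dot{\mathbf{L}}_i = \sum_i \mathbf{r}_i \times \dot{\mathbf{p}}_i = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{Ext} + \sum_i \sum_{j \neq i} \mathbf{r}_i \times \mathbf{f}_{ij} \tag{2.45}$$

Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ the last expression can be written as

$$\sum_i \sum_{j \neq i} \mathbf{r}_i \times \mathbf{f}_{ij} = \sum_i \sum_{j,\, i<j} (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} \tag{2.46}$$

Note that $(\mathbf{r}_i - \mathbf{r}_j)$ is the vector $\mathbf{r}_{ij}$ connecting $j$ to $i$. For central forces the force vector $\mathbf{f}_{ij} = f_{ij} \hat{\mathbf{r}}_{ij}$, thus

$$\sum_i \sum_{j,\, i<j} (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} = \sum_i \sum_{j,\, i<j} \mathbf{r}_{ij} \times f_{ij} \hat{\mathbf{r}}_{ij} = 0 \tag{2.47}$$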

That is, for central internal forces the total internal torque on a system of particles is zero, and the rate of
change of total angular momentum for central internal forces becomes

$$\dot{\mathbf{L}} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^{Ext} = \sum_i \mathbf{N}_i^{Ext} = \mathbf{N}^{Ext} \tag{2.48}$$

where $\mathbf{N}^{Ext}$ is the net external torque acting on the system. Equation 2.48 leads to the differential and integral
forms of the first integral relating the total angular momentum to total external torque.

$$\dot{\mathbf{L}} = \mathbf{N}^{Ext} \qquad \int_1^2 \mathbf{N}^{Ext} \, dt = \mathbf{L}_2 - \mathbf{L}_1 \tag{2.49}$$


Angular momentum conservation occurs in many problems involving zero external torques $\mathbf{N}^{Ext} = 0$, plus
two-body central forces $\mathbf{F} = f(r) \hat{\mathbf{r}}$, since the torque on the particle about the center of the force is zero

$$\mathbf{N} = \mathbf{r} \times \mathbf{F} = f(r) \left[ \mathbf{r} \times \hat{\mathbf{r}} \right] = 0 \tag{2.50}$$

Examples are the central gravitational force for stellar or planetary systems in astrophysics, and the central
electrostatic force manifest for motion of electrons in the atom. In addition, the component of angular
momentum about any axis $\mathbf{L} \cdot \hat{\mathbf{e}}$ is conserved if the net external torque about that axis $\mathbf{N} \cdot \hat{\mathbf{e}} = 0$.


2.3 Example: Bolas thrown by gaucho 


Consider the bolas thrown by a gaucho to catch cattle. This is a 
system with conserved linear and angular momentum about certain 
axes. When the bolas leaves the gaucho’s hand the center of mass 
has a linear velocity V plus an angular momentum about the center 
of mass of L. If no external torques act, then the center of mass of 
the bolas will follow a typical ballistic trajectory in the earth's gravitational
field while the angular momentum vector L is conserved,
that is, both in magnitude and direction. The tension in the ropes 
connecting the three balls does not impact the motion of the system 
as long as the ropes do not snap due to centrifugal forces. 


Bolas thrown by a gaucho 
2.10 Work and kinetic energy for a many-body system


2.10.1 Center-of-mass kinetic energy 


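For a many-body system the position vector $\mathbf{r}'_i$ with respect to the center of mass is defined by

$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.51}$$

The location of the center of mass is uniquely defined as being at the location where $\int \rho \, \mathbf{r}'_i \, dV = 0$. The
velocity of the $i^{th}$ particle can be expressed in terms of the velocity of the center of mass $\dot{\mathbf{R}}$ plus the velocity
of the particle with respect to the center of mass $\dot{\mathbf{r}}'_i$. That is,

$$\dot{\mathbf{r}}_i = \dot{\mathbf{R}} + \dot{\mathbf{r}}'_i \tag{2.52}$$

The total kinetic energy $T$ is

$$T = \sum_i^n \frac{1}{2} m_i v_i^2 = \sum_i^n \frac{1}{2} m_i \dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \sum_i^n \frac{1}{2} m_i \dot{\mathbf{r}}'_i \cdot \dot{\mathbf{r}}'_i + \left( \sum_i^n m_i \dot{\mathbf{r}}'_i \right) \cdot \dot{\mathbf{R}} + \sum_i^n \frac{1}{2} m_i \dot{\mathbf{R}} \cdot \dot{\mathbf{R}} \tag{2.53}$$

For the special case of the center of mass, the middle term is zero since, by definition of the center of mass,
$\sum_i m_i \dot{\mathbf{r}}'_i = 0$. Therefore

$$T = \sum_i^n \frac{1}{2} m_i v_i'^2 + \frac{1}{2} M V^2 \tag{2.54}$$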


Thus the total kinetic energy of the system is equal to the sum of the kinetic energy of a mass $M$ moving
with the center of mass velocity plus the kinetic energy of motion of the individual particles relative to the
center of mass. This is called Samuel König's second theorem.

Note that for a fixed center-of-mass energy, the total kinetic energy $T$ has a minimum value of $\sum_i^n \frac{1}{2} m_i v_i'^2$
when the velocity of the center of mass $V = 0$. For a given internal excitation energy, the minimum energy
required to accelerate colliding bodies occurs when the colliding bodies have identical, but opposite, linear
momenta. That is, when the center-of-mass velocity $V = 0$.
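
As a quick numerical illustration of equation 2.54, the sketch below splits the kinetic energy of a few point masses into the center-of-mass part and the part relative to the center of mass. The function name `kinetic_energy_split` and the sample values are assumptions made for this example only.

```python
def kinetic_energy_split(masses, velocities):
    """Return (T_total, T_relative, T_cm) for point masses with 3-vector velocities.

    T_total    : sum of (1/2) m_i v_i^2 in the chosen frame
    T_relative : sum of (1/2) m_i v'_i^2 with v'_i measured from the center of mass
    T_cm       : (1/2) M V^2 for the total mass moving at the center-of-mass velocity
    """
    M = sum(masses)
    # Center-of-mass velocity, component by component.
    V = tuple(sum(m * v[k] for m, v in zip(masses, velocities)) / M
              for k in range(3))

    def speed_sq(v):
        return sum(c * c for c in v)

    T_total = sum(0.5 * m * speed_sq(v) for m, v in zip(masses, velocities))
    T_relative = sum(
        0.5 * m * speed_sq(tuple(vk - Vk for vk, Vk in zip(v, V)))
        for m, v in zip(masses, velocities)
    )
    T_cm = 0.5 * M * speed_sq(V)
    return T_total, T_relative, T_cm
```

The relative part is unchanged by a boost to another inertial frame, so the total is smallest in the frame where the center of mass is at rest, as stated above.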


2.10.2 Conservative forces and potential energy 


In general, the line integral of a force field $\mathbf{F}$, that is, $\int \mathbf{F} \cdot d\mathbf{r}$, is both path and time dependent. However,
an important class of forces, called conservative forces, exist for which the following two facts are obeyed. 


1) Time independence: 

The force depends only on the particle position r, that is, it does not depend on velocity or time. 

2) Path independence: 

For any two points 1 and 2, the work done by F is independent of the path taken between 1 and 2. 


If forces are path independent, then it is possible to define a scalar field, called potential energy, denoted 
by U(r), that is only a function of position. The path independence can be expressed by noting that the 
integral around a closed loop is zero. That is 


$$\oint \mathbf{F} \cdot d\mathbf{r} = 0 \tag{2.55}$$


Applying Stokes' theorem for a path-independent force leads to the alternate statement that the curl is zero.
See appendix G.3.3.

$$\nabla \times \mathbf{F} = 0 \tag{2.56}$$

Note that the vector product of two del operators $\nabla$ acting on a scalar field $U$ equals

$$\nabla \times \nabla U = 0 \tag{2.57}$$

Thus it is possible to express a path-independent force field as the gradient of a scalar field, $U$, that is

$$\mathbf{F} = -\nabla U \tag{2.58}$$

Then the spatial integral

$$\int_1^2 \mathbf{F} \cdot d\mathbf{r} = -\int_1^2 (\nabla U) \cdot d\mathbf{r} = U_1 - U_2 \tag{2.59}$$

Thus for a path-independent force, the work done on the particle is given by the change in potential energy 
if there is no change in kinetic energy. For example, if an object is lifted against the gravitational field, then 
work is done on the particle and the final potential energy U2 exceeds the initial potential energy, U1. 


2.10.3 Total mechanical energy 


The total mechanical energy E of a particle is defined as the sum of the kinetic and potential energies. 
E=T+U (2.60) 


Note that the potential energy is defined only to within an additive constant since the force $\mathbf{F} = -\nabla U$
depends only on differences in potential energy. Similarly, the kinetic energy is not absolute since any inertial
frame of reference can be used to describe the motion and the velocity of a particle depends on the relative 
velocities of inertial frames. Thus the total mechanical energy E = T + U is not absolute. 

If a single particle is subject to several path-independent forces, such as gravity, linear restoring forces,
etc., then a potential energy $U_i$ can be ascribed to each of the $m$ forces where for each force $\mathbf{F}_i = -\nabla U_i$. In
contrast to the forces, which add vectorially, these scalar potential energies are additive, $U = \sum_i^m U_i$. Thus
the total mechanical energy for $m$ potential energies equals

$$E = T + U(\mathbf{r}) = T + \sum_i^m U_i(\mathbf{r}) \tag{2.61}$$


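The time derivative of the total mechanical energy $E = T + U$ equals

$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} \tag{2.62}$$

Equation 2.18 gave that $dT = \mathbf{F} \cdot d\mathbf{r}$. Thus, the first term in equation 2.62 equals

$$\frac{dT}{dt} = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} \tag{2.63}$$

The potential energy can be a function of both position and time. Thus the time difference in potential
energy due to change in both time and position is given as

$$\frac{dU}{dt} = \sum_i \frac{\partial U}{\partial x_i} \frac{dx_i}{dt} + \frac{\partial U}{\partial t} = (\nabla U) \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.64}$$

The time derivative of the total mechanical energy is given using equations 2.63, 2.64 in equation 2.62.

$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} + (\nabla U) \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} = \left[ \mathbf{F} + (\nabla U) \right] \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.65}$$

Note that if the field is path independent, that is $\nabla \times \mathbf{F} = 0$, then the force and potential are related by

$$\mathbf{F} = -\nabla U \tag{2.66}$$

Therefore, for path independent forces, the first term in the time derivative of the total energy in equation
2.65 is zero. That is,

$$\frac{dE}{dt} = \frac{\partial U}{\partial t} \tag{2.67}$$

In addition, when the potential energy $U$ is not an explicit function of time, then $\frac{\partial U}{\partial t} = 0$ and thus the total
energy is conserved. That is, for the combination of (a) path independence plus (b) time independence, then
the total energy of a conservative field is conserved.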
Note that there are cases where the concept of potential still is useful even when it is time dependent,
provided that path independence applies at any instant, i.e. $\mathbf{F} = -\nabla U$. Examples are a Coulomb field problem
where charges are slowly changing due to leakage etc., or a peripheral collision between two charged
bodies such as nuclei.


2.4 Example: Central force 


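A particle of mass $m$ moves along a trajectory given by $x = x_0 \cos \omega_1 t$ and $y = y_0 \sin \omega_2 t$.

a) Find the x and y components of the force and determine the condition for which the force is a central
force.

Differentiating with respect to time gives

$$\dot{x} = -x_0 \omega_1 \sin(\omega_1 t) \qquad \ddot{x} = -x_0 \omega_1^2 \cos(\omega_1 t)$$
$$\dot{y} = y_0 \omega_2 \cos(\omega_2 t) \qquad \ddot{y} = -y_0 \omega_2^2 \sin(\omega_2 t)$$

Newton's second law gives

$$\mathbf{F} = m \left( \ddot{x} \hat{\mathbf{i}} + \ddot{y} \hat{\mathbf{j}} \right) = -m \left[ x_0 \omega_1^2 \cos(\omega_1 t) \hat{\mathbf{i}} + y_0 \omega_2^2 \sin(\omega_2 t) \hat{\mathbf{j}} \right] = -m \left[ \omega_1^2 x \hat{\mathbf{i}} + \omega_2^2 y \hat{\mathbf{j}} \right]$$

Note that if $\omega_1 = \omega_2 = \omega$ then

$$\mathbf{F} = -m \omega^2 \left[ x \hat{\mathbf{i}} + y \hat{\mathbf{j}} \right] = -m \omega^2 \mathbf{r}$$

That is, it is a central force if $\omega_1 = \omega_2 = \omega$.

b) Find the potential energy as a function of x and y.

Since

$$\mathbf{F} = -\nabla U = -\left( \frac{\partial U}{\partial x} \hat{\mathbf{i}} + \frac{\partial U}{\partial y} \hat{\mathbf{j}} \right)$$

then

$$U = \frac{1}{2} m \left( \omega_1^2 x^2 + \omega_2^2 y^2 \right)$$

assuming that $U = 0$ at the origin.

c) Determine the total energy of the particle and show that it is conserved.

The total energy is

$$E = T + U = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) + \frac{1}{2} m \left( \omega_1^2 x^2 + \omega_2^2 y^2 \right) = \frac{1}{2} m \left( x_0^2 \omega_1^2 + y_0^2 \omega_2^2 \right)$$

since $\cos^2\theta + \sin^2\theta = 1$. Thus the total energy $E$ is a constant and is conserved.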
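
The constancy of the total energy in this example can be spot-checked numerically. This is a minimal sketch; the function name `total_energy` and the parameter values used to exercise it are assumptions chosen for illustration.

```python
import math


def total_energy(t, m, x0, y0, w1, w2):
    """Total energy T + U at time t for x = x0 cos(w1 t), y = y0 sin(w2 t)."""
    x = x0 * math.cos(w1 * t)
    y = y0 * math.sin(w2 * t)
    vx = -x0 * w1 * math.sin(w1 * t)
    vy = y0 * w2 * math.cos(w2 * t)
    kinetic = 0.5 * m * (vx ** 2 + vy ** 2)
    potential = 0.5 * m * (w1 ** 2 * x ** 2 + w2 ** 2 * y ** 2)
    return kinetic + potential


def expected_energy(m, x0, y0, w1, w2):
    """Closed-form constant value (1/2) m (x0^2 w1^2 + y0^2 w2^2)."""
    return 0.5 * m * (x0 ** 2 * w1 ** 2 + y0 ** 2 * w2 ** 2)
```

Evaluating `total_energy` at several times should return the same value as `expected_energy`, even when the two frequencies differ and the force is not central.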


2.10.4 Total mechanical energy for conservative systems 


Equation 2.20 showed that, using Newton's second law, $\mathbf{F} = \frac{d\mathbf{p}}{dt}$, the first-order spatial integral gives that
the work done, $W_{12}$, is related to the change in the kinetic energy. That is,

$$W_{12} \equiv \int_1^2 \mathbf{F} \cdot d\mathbf{r} = \frac{1}{2} m v_2^2 - \frac{1}{2} m v_1^2 = T_2 - T_1 \tag{2.68}$$

The work done $W_{12}$ also can be evaluated in terms of the known forces $\mathbf{F}_i$ in the spatial integral.
Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an
external force $\mathbf{F}_i^{Ext}$ plus internal forces between the $n$ particles of the system

$$\mathbf{F}_i = \mathbf{F}_i^{Ext} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.69}$$

The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton's Law tells us that

$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{Ext} + \sum_{j \neq i}^n \mathbf{f}_{ij} \tag{2.70}$$
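The work done on the system by a force moving from configuration $1 \rightarrow 2$ is given by

$$W_{1 \rightarrow 2} = \sum_i^n \int_1^2 \mathbf{F}_i^{Ext} \cdot d\mathbf{r}_i + \sum_i^n \sum_{j \neq i}^n \int_1^2 \mathbf{f}_{ij} \cdot d\mathbf{r}_i \tag{2.71}$$

Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ then

$$W_{1 \rightarrow 2} = \sum_i^n \int_1^2 \mathbf{F}_i^{Ext} \cdot d\mathbf{r}_i + \sum_i^n \sum_{j,\, i<j}^n \int_1^2 \mathbf{f}_{ij} \cdot (d\mathbf{r}_i - d\mathbf{r}_j) \tag{2.72}$$

where $d\mathbf{r}_i - d\mathbf{r}_j = d\mathbf{r}_{ij}$ is the vector from $j$ to $i$.
Assume that both the external and internal forces are conservative, and thus can be derived from time
independent potentials, that is

$$\mathbf{F}_i^{Ext} = -\nabla_i U_i^{Ext} \tag{2.73}$$
$$\mathbf{f}_{ij} = -\nabla_i U_{ij}^{int} \tag{2.74}$$

Then

$$W_{1 \rightarrow 2} = -\sum_i^n \int_1^2 \nabla_i U_i^{Ext} \cdot d\mathbf{r}_i - \sum_i^n \sum_{j,\, i<j}^n \int_1^2 \nabla_i U_{ij}^{int} \cdot d\mathbf{r}_{ij}$$
$$= \sum_i^n U_i^{Ext}(1) - \sum_i^n U_i^{Ext}(2) + \sum_i^n \sum_{j,\, i<j}^n U_{ij}^{int}(1) - \sum_i^n \sum_{j,\, i<j}^n U_{ij}^{int}(2)$$
$$= U^{Ext}(1) - U^{Ext}(2) + U^{int}(1) - U^{int}(2) \tag{2.75}$$

Define the total external potential energy,

$$U^{Ext} = \sum_i^n U_i^{Ext} \tag{2.76}$$

and the total internal energy

$$U^{int} = \sum_i^n \sum_{j,\, i<j}^n U_{ij}^{int} \tag{2.77}$$

Equating the two equivalent equations for $W_{1 \rightarrow 2}$, that is 2.68 and 2.75, gives that

$$W_{1 \rightarrow 2} = T_2 - T_1 = U^{Ext}(1) - U^{Ext}(2) + U^{int}(1) - U^{int}(2) \tag{2.78}$$

Regrouping the terms in equation 2.78 gives

$$T_1 + U^{Ext}(1) + U^{int}(1) = T_2 + U^{Ext}(2) + U^{int}(2)$$

This shows that, for conservative forces, the total energy is conserved and is given by

$$E = T + U^{Ext} + U^{int} \tag{2.79}$$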


The three first-order integrals for linear momentum, angular momentum, and energy provide powerful 
approaches for solving the motion of Newtonian systems due to the applicability of conservation laws for the 
corresponding linear and angular momentum plus energy conservation for conservative forces. In addition, 
the important concept of center-of-mass motion naturally separates out for these three first-order integrals. 
Although these conservation laws were derived assuming Newton’s Laws of motion, these conservation laws 
are more generally applicable, and these conservation laws surpass the range of validity of Newton’s Laws of 
motion. For example, in 1930 Pauli and Fermi postulated the existence of the neutrino in order to account for 
non-conservation of energy and momentum in β-decay because they did not wish to relinquish the concepts
of energy and momentum conservation. The neutrino was first detected in 1956 confirming the correctness 
of this hypothesis. 
2.11 Virial Theorem


The Virial theorem is an important theorem for a system of moving particles both in classical physics and 
quantum physics. The Virial Theorem is useful when considering a collection of many particles and has a 
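special importance to central-force motion. For a general system of mass points with position vectors $\mathbf{r}_i$ and
applied forces $\mathbf{F}_i$, consider the scalar product $G$

$$G \equiv \sum_i \mathbf{p}_i \cdot \mathbf{r}_i \tag{2.80}$$

where $i$ sums over all particles. The time derivative of $G$ is

$$\frac{dG}{dt} = \sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i + \sum_i \dot{\mathbf{p}}_i \cdot \mathbf{r}_i \tag{2.81}$$

However,

$$\sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i \dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i v_i^2 = 2T \tag{2.82}$$

Also, since $\dot{\mathbf{p}}_i = \mathbf{F}_i$,

$$\sum_i \dot{\mathbf{p}}_i \cdot \mathbf{r}_i = \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \tag{2.83}$$

Thus

$$\frac{dG}{dt} = 2T + \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \tag{2.84}$$

The time average over a period $\tau$ is

$$\frac{1}{\tau} \int_0^\tau \frac{dG}{dt} \, dt = \frac{G(\tau) - G(0)}{\tau} = \langle 2T \rangle + \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle \tag{2.85}$$

where the $\langle \, \rangle$ brackets refer to the time average. Note that if the motion is periodic and the chosen time $\tau$
equals a multiple of the period, then $\frac{G(\tau) - G(0)}{\tau} = 0$. Even if the motion is not periodic, if the constraints and
velocities of all the particles remain finite, then there is an upper bound to $G$. This implies that choosing
$\tau \rightarrow \infty$ means that $\frac{G(\tau) - G(0)}{\tau} \rightarrow 0$. In both cases the left-hand side of the equation tends to zero giving the
Virial theorem

$$\langle T \rangle = -\frac{1}{2} \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle \tag{2.86}$$

The right-hand side of this equation is called the Virial of the system. For a single particle subject to a
conservative central force $\mathbf{F} = -\nabla U$ the Virial theorem equals

$$\langle T \rangle = \frac{1}{2} \langle \nabla U \cdot \mathbf{r} \rangle = \frac{1}{2} \left\langle r \frac{\partial U}{\partial r} \right\rangle \tag{2.87}$$

If the potential is of the form $U = k r^{n+1}$, that is, $F = -k(n+1) r^n$, then $r \frac{\partial U}{\partial r} = (n+1) U$. Thus for a single
particle in a central potential $U = k r^{n+1}$ the Virial theorem reduces to

$$\langle T \rangle = \frac{n+1}{2} \langle U \rangle \tag{2.88}$$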


The following two special cases are of considerable importance in physics.
Hooke's Law: Note that for a linear restoring force $n = 1$ then

$$\langle T \rangle = +\langle U \rangle \qquad (n = 1)$$

You may be familiar with this fact for simple harmonic motion where the average kinetic and potential
energies are the same and both equal half of the total energy.
Inverse-square law: The other interesting case is for the inverse square law $n = -2$ where

$$\langle T \rangle = -\frac{1}{2} \langle U \rangle \qquad (n = -2)$$

The Virial theorem is useful for solving problems in that knowing the exponent $n$ of the field makes it
possible to write down directly the average total energy in the field. For example, for $n = -2$

$$\langle E \rangle = \langle T \rangle + \langle U \rangle = -\frac{1}{2} \langle U \rangle + \langle U \rangle = \frac{1}{2} \langle U \rangle \tag{2.89}$$

This occurs for the Bohr model of the hydrogen atom where the kinetic energy of the bound electron is half
the magnitude of the potential energy. The same result occurs for planetary motion in the solar system.
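
The inverse-square result can be illustrated numerically. The sketch below integrates one period of a bound Kepler orbit and time-averages the kinetic and potential energies. It assumes units with GM = 1, unit particle mass, and semi-major axis 1 (so the period is 2π), and it uses a velocity-Verlet integrator, which is a choice made for this example rather than anything prescribed by the text.

```python
import math


def kepler_virial_averages(ecc=0.5, steps=20000):
    """Time-averaged (T, U) over one period of a Kepler orbit.

    Units are an assumption of this sketch: GM = 1, particle mass = 1,
    semi-major axis = 1, so the orbital period is 2*pi.
    """
    # Start at perihelion: r = 1 - ecc, speed from the vis-viva relation.
    x, y = 1.0 - ecc, 0.0
    vx, vy = 0.0, math.sqrt((1.0 + ecc) / (1.0 - ecc))
    dt = 2.0 * math.pi / steps

    r = math.hypot(x, y)
    ax, ay = -x / r ** 3, -y / r ** 3
    sum_T = 0.0
    sum_U = 0.0
    for _ in range(steps):
        sum_T += 0.5 * (vx * vx + vy * vy)
        sum_U += -1.0 / r
        # Velocity-Verlet step: half kick, drift, recompute force, half kick.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        r = math.hypot(x, y)
        ax, ay = -x / r ** 3, -y / r ** 3
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return sum_T / steps, sum_U / steps
```

Although the instantaneous kinetic and potential energies vary strongly around an eccentric orbit, the averages returned here should satisfy ⟨T⟩ ≈ −⟨U⟩/2 to within the integration error.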


2.5 Example: The ideal gas law 


The Virial theorem deals with average properties and has applications to statistical mechanics. Consider
an ideal gas. According to the Equipartition theorem the average kinetic energy per atom in an ideal gas is
$\frac{3}{2} kT$ where $T$ is the absolute temperature and $k$ is the Boltzmann constant. Thus the average total kinetic
energy for $N$ atoms is $\langle KE \rangle = \frac{3}{2} N k T$. The right-hand side of the Virial theorem contains the force $\mathbf{F}_i$. For
an ideal gas it is assumed that there are no interaction forces between atoms, that is the only force is the
force of constraint of the walls of the pressure vessel. The pressure $P$ is force per unit area and thus the
instantaneous force on an area of wall $dA$ is $d\mathbf{F}_i = -\hat{\mathbf{n}} P \, dA$ where $\hat{\mathbf{n}}$ designates the unit vector normal to
the surface. Thus the right-hand side of the Virial theorem is

$$-\frac{1}{2} \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle = \frac{P}{2} \int \hat{\mathbf{n}} \cdot \mathbf{r}_i \, dA$$

Use of the divergence theorem thus gives that $\int \hat{\mathbf{n}} \cdot \mathbf{r}_i \, dA = \int \nabla \cdot \mathbf{r} \, dV = 3 \int dV = 3V$. Thus the Virial theorem
leads to the ideal gas law, that is

$$N k T = P V$$


2.6 Example: The mass of galaxies 


The Virial theorem can be used to make a crude estimate of the mass of a cluster of galaxies. Assuming a
spherically-symmetric cluster of $N$ galaxies, each of mass $m$, then the total mass of the cluster is $M = Nm$.
A crude estimate of the cluster potential energy is

$$U \approx -\frac{G M^2}{R} \tag{α}$$

where $R$ is the radius of a cluster. The average kinetic energy per galaxy is $\frac{1}{2} m \langle v^2 \rangle$ where $\langle v^2 \rangle$ is the average
square of the galaxy velocities with respect to the center of mass of the cluster. Thus the total kinetic energy
of the cluster is

$$KE \approx \frac{N m \langle v^2 \rangle}{2} = \frac{M \langle v^2 \rangle}{2} \tag{β}$$

The Virial theorem tells us that a central force having a radial dependence of the form $F \propto r^n$ gives
$\langle KE \rangle = \frac{n+1}{2} \langle U \rangle$. For the inverse-square gravitational force then

$$\langle KE \rangle = -\frac{1}{2} \langle U \rangle \tag{γ}$$

Thus equations α, β and γ give an estimate of the total mass of the cluster to be

$$M \approx \frac{R \langle v^2 \rangle}{G}$$

This estimate is larger than the value estimated from the luminosity of the cluster, implying that a large amount
of "dark matter" must exist in galaxies, the nature of which remains an open question in physics.
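
The order-of-magnitude estimate above is easy to evaluate. The following sketch assumes SI units; the cluster radius and rms galaxy speed in the usage note are round illustrative numbers, not measurements quoted in the text.

```python
G_NEWTON = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def virial_cluster_mass(radius_m, mean_square_speed):
    """Crude virial mass estimate M ~ R <v^2> / G.

    radius_m          : cluster radius in metres
    mean_square_speed : mean square galaxy speed <v^2> in m^2/s^2,
                        measured relative to the cluster center of mass
    """
    return radius_m * mean_square_speed / G_NEWTON
```

For example, a hypothetical cluster of radius 1 Mpc (about 3.086e22 m) with an rms galaxy speed of 1000 km/s gives `virial_cluster_mass(3.086e22, (1.0e6) ** 2)`, roughly 4.6e44 kg, or a few times 10^14 solar masses.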
2.12 Applications of Newton’s equations of motion


Newton’s equation of motion can be written in the form

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m \frac{d\mathbf{v}}{dt} = m \frac{d^2 \mathbf{r}}{dt^2} \tag{2.90}$$
A description of the motion of a particle requires a solution of this second-order differential equation of 
motion. This equation of motion may be integrated to find r(t) and v(t) if the initial conditions and 
the force field F(t) are known. Solution of the equation of motion can be complicated for many practical 
examples, but there are various approaches to simplify the solution. It is of value to learn efficient approaches 
to solving problems. 

The following sequence is recommended 

a) Make a vector diagram of the problem indicating forces, velocities, etc. 

b) Write down the known quantities. 

c) Before trying to solve the equation of motion directly, look to see if a basic conservation law applies. 
That is, check if any of the three first-order integrals, can be used to simplify the solution. The use of 
conservation of energy or conservation of momentum can greatly simplify solving problems. 

The following examples show the solution of typical types of problem encountered using Newtonian 
mechanics. 


2.12.1 Constant force problems 


Problems having a constant force imply constant acceleration. The classic example is a block sliding on an
inclined plane, where the block of mass $m$ is acted upon by both gravity and friction. The net force $\mathbf{F}$ is
given by the vector sum of the gravitational force $\mathbf{F}_g$, normal force $\mathbf{N}$ and frictional force $\mathbf{f}_f$.

$$\mathbf{F} = \mathbf{F}_g + \mathbf{N} + \mathbf{f}_f = m \mathbf{a} \tag{2.91}$$

Taking components perpendicular to the inclined plane in the y direction

$$-F_g \cos\theta + N = 0 \tag{2.92}$$

That is, since $F_g = mg$,

$$N = mg \cos\theta \tag{2.93}$$

Similarly, taking components along the inclined plane in the x direction

$$F_g \sin\theta - f_f = m \frac{d^2 x}{dt^2} \tag{2.94}$$

Using the concept of coefficient of friction $\mu$,

$$f_f = \mu N \tag{2.95}$$

Thus the equation of motion can be written as

$$mg \left( \sin\theta - \mu \cos\theta \right) = m \frac{d^2 x}{dt^2} \tag{2.96}$$

The block accelerates if $\sin\theta > \mu \cos\theta$, that is, $\tan\theta > \mu$. The
acceleration is constant if $\mu$ and $\theta$ are constant, that is

$$\frac{d^2 x}{dt^2} = g \left( \sin\theta - \mu \cos\theta \right) \tag{2.97}$$

Figure 2.3: Block on an inclined plane

Remember that if the block is stationary, the friction force balances the component of gravity along the plane such that
$(\sin\theta - \mu \cos\theta) = 0$, that is, the effective friction coefficient is $\mu = \tan\theta$. However, there is a maximum static
friction coefficient $\mu_S$ beyond which the block starts sliding. The kinetic coefficient of friction $\mu_K$ is applicable for
sliding friction and usually $\mu_K < \mu_S$.
Another example of constant force and acceleration is motion of objects free falling in a uniform
gravitational field when air drag is neglected. Then one obtains the simple relations such as $v = u + at$, etc.
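
A short sketch of equation 2.97 follows. It assumes the block starts at rest, so that it stays put whenever tan θ does not exceed the static coefficient; the function name and the default g = 9.81 m/s² are illustrative choices, not values from the text.

```python
import math


def incline_acceleration(theta_deg, mu_k, mu_s=None, g=9.81):
    """Acceleration down the plane for a block released from rest.

    theta_deg : incline angle in degrees
    mu_k      : kinetic coefficient of friction (used once sliding)
    mu_s      : static coefficient of friction; defaults to mu_k (assumption)
    Returns 0.0 if static friction can hold the block (tan(theta) <= mu_s),
    otherwise g (sin(theta) - mu_k cos(theta)).
    """
    if mu_s is None:
        mu_s = mu_k
    theta = math.radians(theta_deg)
    if math.tan(theta) <= mu_s:
        return 0.0
    return g * (math.sin(theta) - mu_k * math.cos(theta))
```

For instance, `incline_acceleration(30, 0.2)` is about 3.2 m/s², while `incline_acceleration(10, 0.2)` returns 0.0 because tan 10° is below the friction coefficient.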
2.12.2 Linear Restoring Force


An important class of problems involve a linear restoring force, that is, they obey Hooke’s law. The equation
of motion for this case is

$$F(x) = -kx = m \ddot{x} \tag{2.98}$$

It is usual to define

$$\omega_0^2 \equiv \frac{k}{m} \tag{2.99}$$

Then the equation of motion can be written as

$$\ddot{x} + \omega_0^2 x = 0 \tag{2.100}$$

which is the equation of the harmonic oscillator. Examples are small oscillations of a mass on a spring,
vibrations of a stretched piano string, etc.
The solution of this second order equation is

$$x(t) = A \sin(\omega_0 t - \delta) \tag{2.101}$$

This is the well known sinusoidal behavior of the displacement for the simple harmonic oscillator. The
angular frequency $\omega_0$ is

$$\omega_0 = \sqrt{\frac{k}{m}} \tag{2.102}$$

Note that this linear system has no dissipative forces, thus the total energy is a constant of motion as
discussed previously. That is, it is a conservative system with a total energy $E$ given by

$$\frac{1}{2} m \dot{x}^2 + \frac{1}{2} k x^2 = E \tag{2.103}$$


The first term is the kinetic energy and the second term is the potential energy. The Virial theorem gives 
that for the linear restoring force the average kinetic energy equals the average potential energy. 


2.12.3 Position-dependent conservative forces 


The linear restoring force is an example of a conservative field. The total energy $E$ is conserved, and if the
field is time independent, then the conservative forces are a function only of position. The easiest way to
solve such problems is to use the concept of potential energy $U$ illustrated in Figure 2.4.

$$U_2 - U_1 = -\int_1^2 \mathbf{F} \cdot d\mathbf{r} \tag{2.104}$$

Consider a conservative force in one dimension. Since it was shown that the total energy $E = T + U$ is
conserved for a conservative field, then

$$E = T + U = \frac{1}{2} m v^2 + U(x) \tag{2.105}$$

Therefore:

$$v = \frac{dx}{dt} = \pm \sqrt{\frac{2}{m} \left[ E - U(x) \right]} \tag{2.106}$$

Integration of this gives

$$t - t_0 = \int_{x_0}^{x} \frac{\pm \, dx}{\sqrt{\frac{2}{m} \left[ E - U(x) \right]}} \tag{2.107}$$

where $x = x_0$ when $t = t_0$. Knowing $U(x)$ it is possible to solve this equation as a function of time.
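
Equation 2.107 can be evaluated numerically for any potential. The sketch below uses a simple midpoint rule and assumes motion in the +x direction over an interval that does not reach a turning point, where the integrand diverges; the function name `travel_time` and its arguments are illustrative.

```python
import math


def travel_time(U, E, m, x_start, x_end, steps=20000):
    """Midpoint-rule estimate of t - t0 from equation 2.107.

    U       : callable giving the potential energy U(x)
    E       : total energy (must exceed U(x) everywhere on the interval)
    m       : particle mass
    x_start : position at t0
    x_end   : final position, with x_end > x_start (motion in +x)
    """
    dx = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        x = x_start + (i + 0.5) * dx
        total += dx / math.sqrt(2.0 / m * (E - U(x)))
    return total
```

As a check against a known case, a harmonic oscillator with k = 4, m = 1 and amplitude 1 has E = 2 and angular frequency 2, so the time to move from x = 0 to x = 0.5 should be arcsin(0.5)/2 = π/12.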
It is possible to understand the general features of the solution just from inspection of the function $U(x)$.
For example, as shown in figure 2.4 the motion for energy $E_1$ is periodic between the turning points $x_a$ and $x_b$.
Since the potential energy curve is approximately parabolic between these limits the motion will exhibit simple
harmonic motion. For $E_0$ the turning points coalesce to $x_0$, that is, there is no motion. For total energy $E_2$ the
motion is periodic in two independent regimes, $x_c \leq x \leq x_d$, and $x_e \leq x \leq x_f$. Classically the particle
cannot jump from one pocket to the other. The motion for the particle with total energy $E_3$ is that it moves freely
from infinity, stops and rebounds at $x = x_g$ and then returns to infinity. That is, the particle bounces off the
potential at $x_g$. For energy $E_4$ the particle moves freely and is unbounded. For all these cases, the actual velocity
is given by the above relation for $v(x)$. Thus the kinetic energy is largest where the potential is deepest. An
example would be motion of a roller coaster car.

Figure 2.4: One-dimensional potential U(x).

Position-dependent forces are encountered extensively in classical mechanics. Examples are the many
manifestations of motion in gravitational fields, such as interplanetary probes, a roller coaster, and automobile
suspension systems. The linear restoring force is an especially simple example of a position-dependent force while
the most frequently encountered conservative potentials are in electrostatics and gravitation, for which the
potentials are:

$$U(r) = \frac{1}{4 \pi \varepsilon_0} \frac{q_1 q_2}{r_{12}} \qquad \text{(Electrostatic potential energy)}$$

$$U(r) = -G \frac{m_1 m_2}{r_{12}} \qquad \text{(Gravitational potential energy)}$$

Knowing $U(r)$ it is possible to solve the equation of motion as a function of time.


2.7 Example: Diatomic molecule 


An example of a conservative field is a vibrating diatomic molecule which has a potential energy dependence
with separation distance $x$ that is described approximately by the Morse function

$$U(x) = U_0 \left[ 1 - e^{-\frac{(x - x_0)}{\delta}} \right]^2 - U_0$$

where $U_0$, $x_0$, and $\delta$ are parameters chosen to best describe the particular pair of atoms. The restoring force
is given by

$$F(x) = -\frac{dU(x)}{dx} = -2 \frac{U_0}{\delta} \left[ 1 - e^{-\frac{(x - x_0)}{\delta}} \right] e^{-\frac{(x - x_0)}{\delta}}$$

The potential has a minimum value of $U(x_0) = -U_0$ at $x = x_0$.
Note that for small amplitude oscillations, where

$$(x - x_0) \ll \delta$$

the exponential term in the potential function can be expanded to give

$$U(x) \approx U_0 \left[ 1 - \left( 1 - \frac{(x - x_0)}{\delta} \right) \right]^2 - U_0 = \frac{U_0}{\delta^2} (x - x_0)^2 - U_0$$

This gives a restoring force

$$F(x) = -\frac{dU(x)}{dx} = -2 \frac{U_0}{\delta^2} (x - x_0)$$

That is, for small amplitudes the restoring force is linear.

Figure: Potential energy function $U(x)/U_0$ versus $x/\delta$ for the diatomic molecule.
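
The Morse potential, its exact force, and the small-amplitude linear approximation can be compared directly. This is a minimal sketch; the function names and the parameter values used to exercise them are assumptions for illustration only.

```python
import math


def morse_potential(x, U0, x0, delta):
    """Morse potential U(x) = U0 [1 - exp(-(x - x0)/delta)]^2 - U0."""
    return U0 * (1.0 - math.exp(-(x - x0) / delta)) ** 2 - U0


def morse_force(x, U0, x0, delta):
    """Exact restoring force F = -dU/dx for the Morse potential."""
    e = math.exp(-(x - x0) / delta)
    return -2.0 * U0 / delta * (1.0 - e) * e


def harmonic_force(x, U0, x0, delta):
    """Small-amplitude (linear) approximation F = -2 U0 (x - x0) / delta^2."""
    return -2.0 * U0 / delta ** 2 * (x - x0)
```

Close to the minimum the two force expressions agree closely, for example to better than one percent when the displacement is a thousandth of δ, while at displacements comparable to δ the exact force is noticeably weaker on the stretched side and stronger on the compressed side.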
2.12.4 Constrained motion


A frequently encountered problem involving position dependent forces, is when the motion is constrained to 
follow a certain trajectory. Forces of constraint must exist to constrain the motion to a specific trajectory. 
Examples are, the roller coaster, a rolling ball on an undulating surface, or a downhill skier, where the 
motion is constrained to follow the surface or track contours. The potential energy can be evaluated at all 
positions along the constrained trajectory for conservative forces such as gravity. However, the additional 
forces of constraint that must exist to constrain the motion, can be complicated and depend on the motion. 
For example, the roller coaster must always balance the gravitational and centripetal forces. Fortunately
forces of constraint $\mathbf{F}_C$ often are normal to the direction of motion and thus do not contribute to the total
mechanical energy since then the work done $\mathbf{F}_C \cdot d\mathbf{l}$ is zero. Magnetic forces $\mathbf{F} = q \mathbf{v} \times \mathbf{B}$ exhibit this feature
of having the force normal to the motion.

Solution of constrained problems is greatly simplified if the other forces are conservative and the forces 
of constraint are normal to the motion, since then energy conservation can be used. 


2.8 Example: Roller coaster 


Consider motion of a roller coaster shown in the adjacent figure. This system is conservative if the friction
and air drag are neglected and then the forces of constraint are normal to the direction of motion.

The kinetic energy at any position is just given by energy conservation and the fact that

$$E = T + U$$

where $U$ depends on the height of the track at any given location. The kinetic energy is greatest when the
potential energy is lowest. The forces of constraint can be deduced if the velocity of motion on the track
is known. Assuming that the motion is confined to a vertical plane, then one has a centripetal force of
constraint $\frac{m v^2}{\rho}$ normal to the track inwards towards the center of the radius of curvature $\rho$, plus the
gravitation force downwards of $mg$.

The constraint force is $\frac{m v_T^2}{\rho} - mg$ upwards at the top of the loop, while it is $\frac{m v_B^2}{\rho} + mg$ downwards at
the bottom of the loop. To ensure that the car and occupants do not leave the required trajectory, the force
upwards at the top of the loop has to be positive, that is, $v_T^2 \geq \rho g$. The velocity at the bottom of the loop
is given by $\frac{1}{2} m v_B^2 = \frac{1}{2} m v_T^2 + 2 m g \rho$ assuming that the track has a constant radius of curvature $\rho$. That is,
at a minimum $v_B^2 = \rho g + 4 \rho g = 5 \rho g$. Therefore the occupants now will feel an acceleration downwards of
at least $\frac{v_B^2}{\rho} + g = 6g$ at the bottom of the loop. The first roller coaster was built with such a constant radius
of curvature but an acceleration of $6g$ was too much for the average passenger. Therefore roller coasters are
designed such that the radius of curvature is much larger at the bottom of the loop, as illustrated, in order to
maintain sufficiently low g loads and also ensure that the required constraint forces exist.

Figure: Roller coaster (CC0 Public Domain)

Note that the minimum velocity at the top of the loop, $v_T$, implies that if the cart starts from rest it must
start at a height $h \geq \frac{\rho}{2}$ above the top of the loop if friction is negligible. Note that the solution for the rolling
ball on such a roller coaster differs from that for a sliding object since one must include the rotational energy
of the ball as well as the linear velocity.

Looping the loop in a sailplane involves the same physics making it necessary to vary the elevator control 
to vary the radius of curvature throughout the loop to minimize the maximum g load. 
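
The numbers in the roller-coaster example follow from energy conservation and can be tabulated for any loop radius. The sketch below assumes a frictionless circular loop of constant radius of curvature; the function name, dictionary keys, and default g = 9.81 m/s² are illustrative choices.

```python
import math


def loop_requirements(rho, g=9.81):
    """Minimum speeds and loads for a frictionless loop of constant radius rho.

    Returns a dict with:
      v_top          : minimum speed at the top, from v_T^2 = rho g
      v_bottom       : corresponding speed at the bottom, from energy conservation
      bottom_load_g  : acceleration felt at the bottom, in units of g
      release_height : minimum height above the top of the loop for a start from rest
    """
    v_top_sq = rho * g
    v_bottom_sq = v_top_sq + 4.0 * g * rho  # drop of 2 rho between top and bottom
    return {
        "v_top": math.sqrt(v_top_sq),
        "v_bottom": math.sqrt(v_bottom_sq),
        "bottom_load_g": (v_bottom_sq / rho + g) / g,
        "release_height": v_top_sq / (2.0 * g),
    }
```

For a hypothetical loop with ρ = 10 m this gives a minimum top speed near 9.9 m/s, a bottom speed near 22 m/s, a bottom load of 6 g, and a release height of 5 m above the top of the loop, independent of the mass of the car.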
2.12.5 Velocity Dependent Forces


Velocity dependent forces are encountered frequently in practical problems. An example is the motion of an
object in a fluid, such as air, where viscous forces retard the motion. In general the retarding force has a
complicated dependence on velocity. A quadratic-velocity drag force in air often can be expressed in the
form,

$$\mathbf{F}_D(v) = -\frac{1}{2} c_D \rho A v^2 \hat{\mathbf{v}} \tag{2.108}$$

where $c_D$ is a dimensionless drag coefficient, $\rho$ is the density of air, $A$ is the cross sectional area perpendicular
to the direction of motion, and $v$ is the velocity. Modern automobiles have drag coefficients as low as 0.3. As
described in chapter 16, the drag coefficient $c_D$ depends on the Reynolds number which relates the inertial to
viscous drag forces. Small sized objects at low velocity, such as light raindrops, have low Reynolds numbers
for which $c_D$ is roughly proportional to $v^{-1}$ leading to a linear dependence of the drag force on velocity, i.e.
$F_D(v) \propto v$. Larger objects moving at higher velocities, such as a car or sky-diver, have higher Reynolds
numbers for which $c_D$ is roughly independent of velocity leading to a drag force $F_D(v) \propto v^2$. This drag force
always points in the opposite direction to the unit velocity vector. Approximately for air

$$\mathbf{F}_D(v) = -\left( c_1 v + c_2 v^2 \right) \hat{\mathbf{v}} \tag{2.109}$$

where for spherical objects of diameter $D$, $c_1 \approx 1.55 \times 10^{-4} D$ and $c_2 \approx 0.22 D^2$ in MKS units. Fortunately, the
equation of motion usually can be integrated when the retarding force has a simple power law dependence.
As an example, consider free fall in the Earth’s gravitational field.


2.9 Example: Vertical fall in the earth’s gravitational field. 


Linear regime $c_1 \gg c_2 v$

For small objects at low velocity, i.e. low Reynolds number, the drag approximately has a linear dependence
on velocity. Then the equation of motion is

$$-mg - c_1 v = m \frac{dv}{dt}$$

Separate the variables and integrate

$$t = \int_{v_0}^{v} \frac{m \, dv}{-mg - c_1 v} = -\frac{m}{c_1} \ln \left( \frac{mg + c_1 v}{mg + c_1 v_0} \right)$$

That is

$$v = -\frac{mg}{c_1} + \left( \frac{mg}{c_1} + v_0 \right) e^{-\frac{c_1}{m} t}$$

Note that for $t \rightarrow \infty$ the velocity approaches a terminal velocity of $v_\infty = -\frac{mg}{c_1}$. The characteristic time
constant is $\tau = \frac{m}{c_1} = \frac{|v_\infty|}{g}$. Note that if $v_0 = 0$, then

$$v = v_\infty \left( 1 - e^{-\frac{t}{\tau}} \right)$$

For the case of small raindrops with $D = 0.5$ mm, then $|v_\infty| = 8$ m/s (18 mph) and time constant $\tau = 0.8$ sec.
Note that in the absence of air drag, these rain drops falling from 2000 m would attain a velocity of over
400 mph. It is fortunate that the drag reduces the speed of rain drops to non-damaging values. Note that
the above relation would predict high velocities for hail. Fortunately, the drag increases quadratically at the
higher velocities attained by large rain drops or hail, and this limits the terminal velocity to moderate values.
For the United States these velocities still are sufficient to do considerable crop damage in the mid-west.

Quadratic regime C9U >> Ci 

For larger objects at higher velocities, i.e. high Reynold’s number, the drag depends on the square of the 
velocity making it necessary to differentiate between objects rising and falling. The equation of motion is 


-mg T CU = m- 
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where the positive sign is for falling objects and negative sign for rising objects. Integrating the equation of 


motion for falling gives 
id d 
i= [or (‘anh LO tanh 2) 
vo TMg + Cov Uo Voo 


where T=, ee and Vs = era That is, T = oa For the case of a falling object with vo = 0, solving for 
velocity gives 
t 
UV = Vo tanh — 
z 
As an example, a 0.6 kg basketball with $D = 0.25$ m will have $v_\infty = 20$ m/s (43 mph) and $\tau = 2.1$ s.
Consider President George H.W. Bush skydiving. Assume his mass is 70 kg and assume an equivalent
spherical shape of the former President to have a diameter of $D = 1$ m. This gives $v_\infty = 56$ m/s
(120 mph) and $\tau = 5.6$ s. When Bush senior opens his 8 m diameter parachute his terminal velocity is
estimated to decrease to 7 m/s (15 mph), which is close to the value for a typical 8 m diameter emergency
parachute, which has a measured terminal velocity of 11 mph in spite of air leakage through the central vent
needed to stabilize the parachute motion.
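The closed-form result for a body dropped from rest can be checked numerically. The following Python sketch is illustrative only: it assumes $g = 9.81$ m/s$^2$, measures speed as positive downward, and uses hypothetical function names. It compares $v_\infty \tanh(t/\tau)$ with a simple Euler integration of $\dot{v} = g\left(1 - v^2/v_\infty^2\right)$, which is the falling-body equation of motion divided by $m$.

```python
import math

G = 9.81  # m/s^2, assumed value of the acceleration due to gravity

def speed_quadratic(t, v_inf):
    """Downward speed after time t for a drop from rest: v_inf * tanh(t / tau)."""
    tau = v_inf / G
    return v_inf * math.tanh(t / tau)

def speed_numeric(t_end, v_inf, dt=1e-4):
    """Euler integration of dv/dt = g * (1 - (v / v_inf)**2), speed positive downward."""
    v = 0.0
    for _ in range(int(round(t_end / dt))):
        v += G * (1.0 - (v / v_inf) ** 2) * dt
    return v

v_inf = 20.0  # m/s, the basketball terminal speed quoted in the text
for t in (0.5, 2.0, 6.0):
    print(t, speed_quadratic(t, v_inf), speed_numeric(t, v_inf))
```

The two columns should agree closely, and both approach $v_\infty$ once $t$ is several times $\tau$.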


2.10 Example: Projectile motion in air 


Consider a projectile initially at $x = y = 0$ at $t = 0$, that is fired at an initial velocity $v_0$ at an angle
$\theta$ to the horizontal. In order to understand the general features of the solution, assume that the drag is
proportional to velocity. This is incorrect for typical projectile velocities, but simplifies the mathematics. The
equations of motion can be expressed as

$$m\ddot{x} = -km\dot{x}$$
$$m\ddot{y} = -km\dot{y} - mg$$

where $k$ is the coefficient for air drag. Take the initial conditions at $t = 0$ to be $x = y = 0$, $\dot{x} = v_0\cos\theta$,
$\dot{y} = v_0\sin\theta$.
Solving in the $x$ coordinate,

$$\frac{d\dot{x}}{dt} = -k\dot{x}$$

Therefore

$$\dot{x} = v_0\cos\theta\, e^{-kt}$$

That is, the velocity decays to zero with a time constant $\tau = \frac{1}{k}$.
Integration of the velocity equation gives

$$x = \frac{v_0\cos\theta}{k}\left(1 - e^{-kt}\right)$$

Note that this implies that the body approaches a value of $x = \frac{v_0\cos\theta}{k}$ as $t \to \infty$.

The trajectory of an object is distorted from the parabolic shape, that occurs for k = 0, due to the rapid 
drop in range as the drag coefficient increases. For realistic cases it is necessary to use a computer to solve 
this numerically. 
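As a minimal sketch of such a numerical solution, the following Python code steps the linear-drag equations of motion forward in time and compares the horizontal position at landing with the analytic result $x = \frac{v_0\cos\theta}{k}\left(1 - e^{-kt}\right)$. The step size, launch speed, and drag coefficients are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def x_analytic(t, v0, theta, k):
    """Horizontal position for linear drag: (v0 cos(theta) / k) * (1 - exp(-k t)), k > 0."""
    return v0 * math.cos(theta) / k * (1.0 - math.exp(-k * t))

def fly(v0, theta, k, g=9.81, dt=1e-4):
    """Step x'' = -k x', y'' = -k y' - g until the projectile returns to y = 0.

    Returns (range, flight_time) from a fixed-step Euler-type integration.
    """
    x = y = t = 0.0
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    while True:
        vx -= k * vx * dt
        vy -= (k * vy + g) * dt
        x += vx * dt
        y += vy * dt
        t += dt
        if y <= 0.0:
            return x, t

v0, theta = 50.0, math.radians(45.0)
for k in (0.0, 0.1, 0.5):
    rng, t_flight = fly(v0, theta, k)
    print(f"k = {k:3.1f} 1/s: range = {rng:6.1f} m, flight time = {t_flight:4.2f} s")
```

For $k = 0$ the range reduces to the vacuum value $v_0^2\sin 2\theta / g$, and the range falls as $k$ grows.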


2.12.6 Systems with Variable Mass 


Classic examples of systems with variable mass are the rocket, a falling chain, and nuclear fission. Consider 
the problem of vertical rocket motion in a gravitational field using Newtonian mechanics. When there is a 
vertical gravitational external field, the vertical momentum is not conserved due to both gravity and the 
ejection of rocket propellant. In a time $dt$ the rocket ejects propellant $dm_p$ vertically with exhaust velocity
relative to the rocket of $u$. Thus the momentum imparted to this propellant is

$$dp_p = -u\,dm_p$$ (2.110)

Therefore the rocket is given an equal and opposite increase in momentum $dp_r$

$$dp_r = +u\,dm_p$$ (2.111)

In the time interval $dt$ the net change in the linear momentum of the rocket plus fuel system is given by

$$dp = (m - dm_p)(v + dv) + dm_p(v - u) - mv = m\,dv - u\,dm_p$$ (2.112)


The rate of change of the linear momentum thus equals

$$F_{ex} = \frac{dp}{dt} = m\frac{dv}{dt} - u\frac{dm_p}{dt}$$ (2.113)

Consider the problem for the special case of vertical ascent of the rocket against the external gravitational
force $F_{ex} = -mg$. Then

$$-mg + u\frac{dm_p}{dt} = m\frac{dv}{dt}$$ (2.114)

This can be rewritten as

$$-mg + u\dot{m}_p = m\dot{v}$$

The second term comes from the variable rocket mass where the loss of mass of the rocket equals the mass
of the ejected propellant. Assuming a constant fuel burn $\dot{m}_p = \alpha$ then

$$\dot{m} = -\dot{m}_p = -\alpha$$ (2.115)

where $\alpha > 0$. Then the equation becomes

$$dv = \left(-g + \frac{\alpha}{m}u\right)dt$$ (2.116)

Since

$$\frac{dm}{dt} = -\alpha$$ (2.117)

then

$$-\frac{dm}{\alpha} = dt$$ (2.118)

Substituting this into equation 2.116 gives

$$dv = \left(\frac{g}{\alpha} - \frac{u}{m}\right)dm$$ (2.119)

Integration, for a rocket of initial mass $m_0$ starting from rest, gives

$$v = -\frac{g}{\alpha}(m_0 - m) + u\ln\left(\frac{m_0}{m}\right)$$ (2.120)

But the change in mass is given by

$$\int_{m_0}^{m} dm = -\alpha\int_0^t dt$$ (2.121)

That is

$$m_0 - m = \alpha t$$ (2.122)

Thus

$$v = -gt + u\ln\left(\frac{m_0}{m}\right)$$ (2.123)

Figure 2.5: Vertical motion of a rocket in a gravitational field.


Note that once the propellant is exhausted the rocket will continue to fly upwards as it decelerates in
the gravitational field. You can easily calculate the maximum height. Note that this formula assumes that
the acceleration due to gravity is constant whereas for large heights above the Earth it is necessary to use
the true gravitational force $-G\frac{Mm}{r^2}$ where $r$ is the distance from the center of the Earth. In real situations
it is necessary to include air drag, which requires a computer to numerically solve the equations of motion.
The highest rocket velocity is attained by maximizing the exhaust velocity and the ratio of initial to final
mass. Because the terminal velocity is limited by the mass ratio, engineers construct multistage rockets that
jettison the spent fuel containers and rockets. The variational-principle approach applied to variable mass
problems is discussed in chapter 8.7.
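The rocket result can be checked against a direct numerical integration of equation 2.116. The Python sketch below is illustrative only: the rocket parameters are hypothetical, $g = 9.81$ m/s$^2$ is assumed constant, and the closed form is valid only while propellant remains, that is while $m_0 - \alpha t$ exceeds the final mass.

```python
import math

def v_analytic(t, u, m0, alpha, g=9.81):
    """Equation 2.123 with m = m0 - alpha * t, for a rocket starting from rest."""
    return -g * t + u * math.log(m0 / (m0 - alpha * t))

def v_numeric(t_end, u, m0, alpha, g=9.81, dt=1e-3):
    """Euler integration of dv = (-g + alpha * u / m) dt, equation 2.116."""
    v, m = 0.0, m0
    for _ in range(int(round(t_end / dt))):
        v += (-g + alpha * u / m) * dt
        m -= alpha * dt
    return v

# Hypothetical rocket: 1000 kg at launch, burning 10 kg/s with 2500 m/s exhaust speed
u, m0, alpha, t_burn = 2500.0, 1000.0, 10.0, 60.0
print(v_analytic(t_burn, u, m0, alpha), v_numeric(t_burn, u, m0, alpha))
```

Doubling the mass ratio $m_0/m$ adds only $u\ln 2$ to the final velocity, which is the logarithmic limitation that motivates multistage rockets.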
2.12.7 Rigid-body rotation about a body-fixed rotation axis


The most general case of rigid-body rotation involves rotation about some body-fixed point with the orien- 
tation of the rotation axis undefined. For example, an object spinning in space will rotate about the center 
of mass with the rotation axis having any orientation. Another example is a child's spinning top which spins 
with arbitrary orientation of the axis of rotation about the pointed end which touches the ground about a 
static location. Such rotation about a body-fixed point is complicated and will be discussed in chapter 13. 
Rigid-body rotation is easier to handle if the orientation of the axis of rotation is fixed with respect to the 
rigid body. An example of such motion is a hinged door. 
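For a rigid body rotating with angular velocity $\boldsymbol{\omega}$, the total angular momentum $\mathbf{L}$ is given by

$$\mathbf{L} = \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i$$ (2.124)

For rotation, appendix equation D29 gives

$$\mathbf{v}_i = \boldsymbol{\omega} \times \mathbf{r}_i$$ (2.94)

thus the angular momentum can be written as

$$\mathbf{L} = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i = \sum_i^n m_i \mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i)$$ (2.125)

The vector triple product can be simplified using the vector identity equation B.24 giving

$$\mathbf{L} = \sum_i^n \left[(m_i r_i^2)\,\boldsymbol{\omega} - (\mathbf{r}_i \cdot \boldsymbol{\omega})\, m_i \mathbf{r}_i\right]$$ (2.126)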


Rigid-body rotation about a body-fixed symmetry axis 


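The simplest case for rigid-body rotation is when the body has a symmetry axis with the angular velocity $\boldsymbol{\omega}$
parallel to this body-fixed symmetry axis. For this case $\mathbf{r}_i$ can be taken perpendicular to $\boldsymbol{\omega}$, for which
the second term in equation 2.126 vanishes, i.e. $(\mathbf{r}_i \cdot \boldsymbol{\omega}) = 0$, thus

$$\mathbf{L}_{sym} = \sum_i^n (m_i r_i^2)\,\boldsymbol{\omega}$$   ($\mathbf{r}_i$ perpendicular to $\boldsymbol{\omega}$)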


The moment of inertia about the symmetry axis is defined as

$$I_{sym} \equiv \sum_i^n m_i r_i^2$$ (2.127)

where $r_i$ is the perpendicular distance from the axis of rotation to the mass $m_i$. For a continuous body the
moment of inertia can be generalized to an integral over the mass density $\rho$ of the body

$$I_{sym} \equiv \int \rho r^2\, dV$$ (2.128)

where $r$ is the perpendicular distance from the rotation axis. The definition of the moment of inertia allows rewriting the
angular momentum about a symmetry axis $\mathbf{L}_{sym}$ in the form

$$\mathbf{L}_{sym} = I_{sym}\boldsymbol{\omega}$$ (2.129)

where the moment of inertia $I_{sym}$ is taken about the symmetry axis, assuming that the angular velocity
vector is parallel to the symmetry axis.


Rigid-body rotation about a non-symmetric body-fixed axis


In general the fixed axis of rotation is not aligned with a symmetry axis of the body, or the body does not 
have a symmetry axis, both of which complicate the problem. 

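For illustration consider that the rigid body comprises a system of $n$ masses $m_i$ located at positions $\mathbf{r}_i$,
with the rigid body rotating about the $z$ axis with angular velocity $\boldsymbol{\omega}$. That is,

$$\boldsymbol{\omega} = \omega_z \hat{\mathbf{z}}$$ (2.130)

In cartesian coordinates the fixed-frame vector for particle $i$ is

$$\mathbf{r}_i = (x_i, y_i, z_i)$$ (2.131)

Using these in the cross product (2.94) gives

$$\mathbf{v}_i = \boldsymbol{\omega} \times \mathbf{r}_i = \begin{pmatrix} -\omega_z y_i \\ \omega_z x_i \\ 0 \end{pmatrix}$$ (2.132)

which is written as a column vector for clarity. Inserting $\mathbf{v}_i$ in the cross-product $\mathbf{r}_i \times \mathbf{v}_i$ gives the components
of the angular momentum to be

$$\mathbf{L} = \sum_i^n m_i \mathbf{r}_i \times \mathbf{v}_i = \sum_i^n m_i \omega_z \begin{pmatrix} -z_i x_i \\ -z_i y_i \\ x_i^2 + y_i^2 \end{pmatrix}$$

That is, the components of the angular momentum are

$$L_x = -\left(\sum_i^n m_i z_i x_i\right)\omega_z \equiv I_{xz}\omega_z$$
$$L_y = -\left(\sum_i^n m_i z_i y_i\right)\omega_z \equiv I_{yz}\omega_z$$ (2.133)
$$L_z = \left(\sum_i^n m_i \left[x_i^2 + y_i^2\right]\right)\omega_z \equiv I_{zz}\omega_z$$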

Note that the perpendicular distance from the $z$ axis in cylindrical coordinates is $\rho = \sqrt{x^2 + y^2}$, thus the
angular momentum $L_z$ about the $z$ axis can be written as

$$L_z = \left(\sum_i^n m_i \rho_i^2\right)\omega_z \equiv I_{zz}\omega_z$$ (2.134)

where (2.134) gives the elementary formula for the moment of inertia $I_{zz} = I_{sym}$ about the $z$ axis given earlier
in (2.129).

Figure 2.6: A rigid rotating body comprising a single mass $m$ attached by a massless rod at a fixed
angle $\alpha$, shown at the instant when $m$ happens to lie in the $yz$ plane. As the body rotates about
the $z$ axis the mass $m$ has a velocity and momentum into the page (the negative $x$ direction).
Therefore the angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is in the direction shown, which is not parallel to the
angular velocity $\boldsymbol{\omega}$.

The surprising result is that $L_x$ and $L_y$ are non-zero, implying that the total angular momentum vector $\mathbf{L}$ is
in general not parallel with $\boldsymbol{\omega}$. This can be understood by considering the single body $m$ shown in figure 2.6.
When the body is in the $y,z$ plane then $x = 0$ and $L_x = 0$. Thus the angular momentum vector $\mathbf{L}$ has a
component along the $-y$ direction as shown, which is not parallel with $\boldsymbol{\omega}$ and, since the vectors $\boldsymbol{\omega}$, $\mathbf{L}$, $\mathbf{r}$ are
coplanar, $\mathbf{L}$ must sweep around the rotation axis $\boldsymbol{\omega}$ to remain coplanar with the body as it rotates
about the $z$ axis. Instantaneously the velocity of the body $\mathbf{v}_i$ is into the plane of the paper and, since
$\mathbf{L}_i = m_i \mathbf{r}_i \times \mathbf{v}_i$, then $\mathbf{L}_i$ is at an angle $(90^\circ - \alpha)$ to the $z$ axis. This implies that a torque must be applied
to rotate the angular momentum vector. This explains why your automobile shakes if the rotation axis and
symmetry axis are not parallel for one wheel.

The first two moments in (2.133) are called products of inertia of the body, designated by the pair of
axes involved. Therefore, to avoid confusion, it is necessary to define the diagonal moment, which is called
the moment of inertia, by two subscripts as $I_{zz}$. Thus in general, a body can have three moments of inertia
about the three axes plus three products of inertia. This group of moments comprises the inertia tensor
which will be discussed further in chapter 13. If a body has an axis of symmetry along the $z$ axis then the
summations will give $I_{xz} = I_{yz} = 0$ while $I_{zz}$ will be unchanged. That is, for rotation about a symmetry
axis the angular momentum and rotation axes are parallel. Any axis along which the angular momentum
and angular velocity coincide is called a principal axis of the body.
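The misalignment of $\mathbf{L}$ and $\boldsymbol{\omega}$ can be illustrated numerically. This Python sketch evaluates equation 2.125 directly for the single tilted mass of figure 2.6 and for a symmetric pair of masses. The mass, rod length, tilt angle, and angular velocity are arbitrary assumed values, and the helper names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(masses, positions, omega):
    """L = sum_i m_i r_i x (omega x r_i), equation 2.125."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        l_i = cross(r, cross(omega, r))
        for j in range(3):
            L[j] += m * l_i[j]
    return tuple(L)

alpha = math.radians(30.0)                 # tilt of the rod from the z axis (assumed)
m, d, wz = 2.0, 1.5, 4.0                   # kg, m, rad/s (assumed)
y, z = d * math.sin(alpha), d * math.cos(alpha)
omega = (0.0, 0.0, wz)

L_single = angular_momentum([m], [(0.0, y, z)], omega)
L_pair = angular_momentum([m, m], [(0.0, y, z), (0.0, -y, z)], omega)
tilt = math.degrees(math.atan2(abs(L_single[1]), L_single[2]))
print(L_single, tilt)   # L has a -y component; tilt from the z axis is 90 - 30 degrees
print(L_pair)           # the products of inertia cancel, so L is along z
```

Adding the mirror-image mass restores the symmetry about the $z$ axis, so the $y$ component cancels and $\mathbf{L}$ becomes parallel to $\boldsymbol{\omega}$.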


2.11 Example: Moment of inertia of a thin door 


Consider that the door has width $a$ and height $b$ and assume the door thickness is negligible with areal
density $\sigma$ kg/m$^2$. Assume that the door is hinged about the $y$ axis. The mass of a surface element of
dimension $dx\,dy$ at a distance $x$ from the rotation axis is $dm = \sigma\,dx\,dy$, thus the mass of the complete door
is $M = \sigma ab$. The moment of inertia about the $y$ axis is given by

$$I = \int_{x=0}^{a}\int_{y=0}^{b} \sigma x^2\, dy\, dx = \frac{1}{3}\sigma b a^3 = \frac{1}{3}Ma^2$$
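The integral can be confirmed with a short numerical check. The sketch below sums the contributions $\sigma x^2\, b\, dx$ of thin vertical strips using the midpoint rule; the door dimensions and areal density are assumed values chosen only for illustration.

```python
def door_moment_of_inertia(sigma, a, b, n=1000):
    """Midpoint-rule estimate of the integral of sigma * x**2 over a door of width a, height b."""
    dx = a / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx                 # distance of the strip from the hinge axis
        total += sigma * x**2 * b * dx     # strip of height b and width dx
    return total

sigma, a, b = 10.0, 0.9, 2.0               # kg/m^2, m, m (assumed)
M = sigma * a * b
print(door_moment_of_inertia(sigma, a, b), M * a**2 / 3.0)
```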


2.12 Example: Merry-go-round 


A child of mass $m$ jumps onto the outside edge of a circular merry-go-round of moment of inertia $I$,
radius $R$ and initial angular velocity $\omega_0$. What is the final angular velocity $\omega_f$?

Let the initial angular momentum be $L_0$ and the final angular momentum be $L_f$. Assuming the child jumps
on with zero angular velocity, the conservation of angular momentum implies that

$$L_0 = L_f$$
$$I\omega_0 = I\omega_f + m v_f R$$
$$I\frac{v_0}{R} = \frac{v_f}{R}\left(I + mR^2\right)$$

where $v_0 = \omega_0 R$ and $v_f = \omega_f R$ are the initial and final speeds of the rim. That is

$$\frac{v_f}{v_0} = \frac{\omega_f}{\omega_0} = \frac{I}{I + mR^2}$$

Note that this is true independent of the details of the acceleration of the initially stationary child.


2.13 Example: Cue pushes a billiard ball 


Consider a billiard ball of mass $M$ and radius $R$ that is pushed by a cue in a direction that passes through
the center of gravity such that the ball attains a velocity $v_0$. The friction coefficient between the table and
the ball is $\mu$. How far does the ball move before the initial slipping motion changes to pure rolling motion?

[Figure: Cue pushing a billiard ball horizontally at the height of the centre of rotation of the ball.]

Since the direction of the cue force passes through the center of mass of the ball, it contributes zero
torque to the ball. Thus the initial angular momentum is zero at $t = 0$. The friction force $f$ points opposite
to the direction of motion and causes a torque $N_s$ about the center of mass in the direction $\hat{s}$.


$$N_s = fR = \mu M g R$$

Since the moment of inertia about the center of a uniform sphere is $I = \frac{2}{5}MR^2$ then the angular acceleration
of the ball is

$$\dot{\omega} = \frac{\mu M g R}{I} = \frac{\mu M g R}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{\mu g}{R}$$ ($\alpha$)

Moreover the frictional force causes a deceleration $a_s$ of the linear velocity of the center of mass of

$$a_s = -\frac{f}{M} = -\mu g$$ ($\beta$)

Integrating $\alpha$ from time zero to $t$ gives

$$\omega = \int_0^t \dot{\omega}\, dt = \frac{5}{2}\frac{\mu g}{R}t$$

The linear velocity of the center of mass at time $t$ is given by integration of equation $\beta$

$$v_s = v_0 + \int_0^t a_s\, dt = v_0 - \mu g t$$

The billiard ball stops sliding and only rolls when $v_s = \omega R$, that is, when

$$\frac{5}{2}\frac{\mu g}{R}tR = v_0 - \mu g t$$

That is, when

$$t_{roll} = \frac{2}{7}\frac{v_0}{\mu g}$$

Thus the ball slips for a distance

$$s = \int_0^{t_{roll}} v_s\, dt = v_0 t_{roll} - \frac{\mu g}{2}t_{roll}^2 = \frac{12}{49}\frac{v_0^2}{\mu g}$$

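Note that if the ball is pushed at a distance $h$ above the center of mass, besides the linear velocity there
is an initial angular momentum $M v_0 h$, which gives an initial angular velocity of

$$\omega = \frac{M v_0 h}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{v_0 h}{R^2}$$

For the special case $h = \frac{2}{5}R$, the ball immediately assumes a pure non-slipping roll. For $h < \frac{2}{5}R$ one has
$\omega < \frac{v_0}{R}$ while $h > \frac{2}{5}R$ corresponds to $\omega > \frac{v_0}{R}$. In the latter case the frictional force points forward.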
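The slip-to-roll transition for a ball struck through its center can be reproduced by stepping the two equations of motion until $v_s = \omega R$. The Python sketch below is a simple fixed-step integration under assumed values of $v_0$, $\mu$, and $R$; it uses $I = \frac{2}{5}MR^2$ for a uniform sphere, so the mass cancels.

```python
def slip_to_roll(v0, mu, R, g=9.81, dt=1e-5):
    """Integrate dv/dt = -mu g and dw/dt = (5/2) mu g / R until v = w R.

    Returns (time at which rolling starts, distance travelled while slipping).
    """
    v, w, s, t = v0, 0.0, 0.0, 0.0
    while v > w * R:
        v -= mu * g * dt
        w += 2.5 * mu * g / R * dt
        s += v * dt
        t += dt
    return t, s

v0, mu, R = 2.0, 0.2, 0.028        # m/s, friction coefficient, m (assumed)
t_roll, s_slip = slip_to_roll(v0, mu, R)
print(t_roll, 2.0 * v0 / (7.0 * mu * 9.81))
print(s_slip, 12.0 * v0**2 / (49.0 * mu * 9.81))
```

Each printed pair compares the numerical value with the corresponding closed-form expression; note that neither depends on $R$.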


2.12.8 Time dependent forces 


Many problems involve action in the presence of a time dependent force. There are two extreme cases that 
are often encountered. One case is an impulsive force that acts for a very short time, for example, striking 
a ball with a bat, or the collision of two cars. The second case involves an oscillatory time dependent force. 
The response to impulsive forces is discussed below whereas the response to oscillatory time-dependent forces 
is discussed in chapter 3. 


Translational impulsive forces 


An impulsive force acts for a very short time relative to the response time of the mechanical system being 
discussed. In principle the equation of motion can be solved if the complicated time dependence of the force, 
F(t), is known. However, often it is possible to use the much simpler approach employing the concept of an 
impulse and the principle of the conservation of linear momentum. 

Define the linear impulse $\mathbf{P}$ to be the first-order time integral of the time-dependent force.

$$\mathbf{P} \equiv \int \mathbf{F}(t)\, dt$$ (2.135)

Since $\mathbf{F}(t) = \frac{d\mathbf{p}}{dt}$ then equation 2.135 gives that

$$\mathbf{P} = \int_0^t \frac{d\mathbf{p}}{dt'}\, dt' = \int_0^t d\mathbf{p} = \mathbf{p}(t) - \mathbf{p}_0 = \Delta\mathbf{p}$$ (2.136)
Thus the impulse $\mathbf{P}$ is an unambiguous quantity that equals the change in linear momentum of the object
that has been struck, which is independent of the details of the time dependence of the impulsive force.
Computation of the spatial motion still requires knowledge of $\mathbf{F}(t)$ since equation 2.136 can be written as

$$\mathbf{v}(t) = \frac{1}{m}\int_0^t \mathbf{F}(t')\, dt' + \mathbf{v}_0$$ (2.137)

Integration gives

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \int_0^t \left[\frac{1}{m}\int_0^{t''} \mathbf{F}(t')\, dt'\right] dt''$$ (2.138)

In general this is complicated. However, for the case of a constant force $\mathbf{F}(t) = \mathbf{F}_0$, this simplifies to the
constant acceleration equation

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{2}\frac{\mathbf{F}_0}{m}t^2$$ (2.139)

where the constant acceleration $\mathbf{a} = \frac{\mathbf{F}_0}{m}$.


Angular impulsive torques 


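Note that the principle of impulse also applies to angular motion. Define an impulsive torque $\mathbf{T}$ as the
first-order time integral of the time-dependent torque.

$$\mathbf{T} \equiv \int \mathbf{N}(t)\, dt$$ (2.140)

Since torque is related to the rate of change of angular momentum

$$\mathbf{N}(t) = \frac{d\mathbf{L}}{dt}$$ (2.141)

then

$$\mathbf{T} = \int_0^t \frac{d\mathbf{L}}{dt'}\, dt' = \int_0^t d\mathbf{L} = \mathbf{L}(t) - \mathbf{L}_0 = \Delta\mathbf{L}$$ (2.142)

Thus the impulsive torque $\mathbf{T}$ equals the change in angular momentum $\Delta\mathbf{L}$ of the struck body.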
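The statement that the impulse fixes the momentum change but not the spatial motion can be illustrated for the translational case. The Python sketch below applies two one-dimensional force pulses with the same time integral to a body at rest. The pulse shapes, impulse, duration, and mass are assumptions chosen for illustration, and the function name is hypothetical.

```python
def kick(force, m, duration, n=10000):
    """Integrate dv/dt = F(t)/m and dx/dt = v from rest over the pulse duration.

    Returns (momentum change, displacement during the pulse).
    """
    dt = duration / n
    v = x = 0.0
    for i in range(n):
        t_mid = (i + 0.5) * dt
        v_new = v + force(t_mid) / m * dt    # midpoint rule for the impulse (2.136)
        x += 0.5 * (v + v_new) * dt          # trapezoidal rule for the position (2.138)
        v = v_new
    return m * v, x

P, T, m = 1.0, 0.01, 0.15                    # impulse (N s), pulse length (s), mass (kg)

def flat(t):
    return P / T                             # constant force

def ramp(t):
    return 2.0 * P * t / T**2                # force rising linearly from zero

for name, F in (("flat", flat), ("ramp", ramp)):
    dp, dx = kick(F, m, T)
    print(name, dp, dx)
```

Both pulses deliver the same $\Delta p = P$, but the body moves $PT/2m$ during the flat pulse and only $PT/3m$ during the ramp, since the ramp delivers most of its momentum late.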


2.14 Example: Center of percussion of a baseball bat 


When an impulsive force $\mathbf{P}$ strikes a bat of mass $M$ at a distance $\mathbf{s}$ from the center of mass, then both the
linear momentum of the center of mass, and the angular momentum about the center of mass, of the bat are
changed. Assume that the ball strikes the bat with an impulsive force $\mathbf{P} = \Delta\mathbf{p}^{ball}$ perpendicular to the
symmetry axis of the bat at the strike point $S$ which is a distance $\mathbf{s}$ from the center of mass of the bat. The
translational impulse given to the bat equals the change in linear momentum of the ball as given by
equation 2.136 coupled with the conservation of linear momentum

$$\mathbf{P} = \Delta\mathbf{p}^{ball} = M\Delta\mathbf{v}^{bat}_{cm}$$

Similarly equation 2.142 gives that the angular impulse $\mathbf{T}$ equals the change in angular momentum about the
center of mass to be

$$\mathbf{T} = \mathbf{s} \times \mathbf{P} = \Delta\mathbf{L} = I_{cm}\Delta\boldsymbol{\omega}_{cm}$$

The above equations give that

$$\Delta\mathbf{v}^{bat}_{cm} = \frac{\mathbf{P}}{M}$$
$$\Delta\boldsymbol{\omega}^{bat}_{cm} = \frac{\mathbf{s} \times \mathbf{P}}{I_{cm}}$$

Assume that the bat was stationary prior to the strike, then after the strike the net translational velocity
of a point $O$ along the body-fixed symmetry axis of the bat at a distance $\mathbf{y}$ from the center of mass, is given
by

$$\mathbf{v}(y) = \Delta\mathbf{v}_{cm} + \Delta\boldsymbol{\omega}_{cm} \times \mathbf{y} = \frac{\mathbf{P}}{M} + \frac{(\mathbf{s} \times \mathbf{P}) \times \mathbf{y}}{I_{cm}} = \frac{\mathbf{P}}{M} + \frac{1}{I_{cm}}\left[(\mathbf{s} \cdot \mathbf{y})\mathbf{P} - (\mathbf{y} \cdot \mathbf{P})\mathbf{s}\right]$$

Both $\mathbf{s}$ and $\mathbf{y}$ lie along the symmetry axis and $\mathbf{P}$ is assumed to be perpendicular to that axis, thus
$(\mathbf{s} \cdot \mathbf{P}) = (\mathbf{y} \cdot \mathbf{P}) = 0$, which simplifies the above equation to

$$\mathbf{v}(y) = \Delta\mathbf{v}_{cm} + \Delta\boldsymbol{\omega}_{cm} \times \mathbf{y} = \frac{\mathbf{P}}{M}\left(1 + \frac{M(\mathbf{s} \cdot \mathbf{y})}{I_{cm}}\right)$$

Note that the translational velocity of the location $O$, along the bat symmetry axis at a distance $\mathbf{y}$ from the
center of mass, is zero if the bracket equals zero, that is, if

$$\mathbf{s} \cdot \mathbf{y} = -\frac{I_{cm}}{M} = -k_{cm}^2$$

where $k_{cm}$ is called the radius of gyration of the body about the center of mass. Note that when the scalar
product $\mathbf{s} \cdot \mathbf{y} = -\frac{I_{cm}}{M} = -k_{cm}^2$, then there will be no translational motion at the point $O$. This point on the
y axis lies on the opposite side of the center of mass from the strike point S, and is called the center of 
percussion corresponding to the impulse at the point $S$. The center of percussion often is referred to as the
“sweet spot” for an object corresponding to the impulse at the point $S$. For a baseball bat the batter holds
the bat at the center of percussion so that they do not feel an impulse in their hands when the ball is struck
at the point $S$. This principle is used extensively to design bats for all sports involving striking a ball with
a bat, such as cricket, squash, tennis, etc., as well as weapons such as swords and axes used to decapitate
opponents.
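The condition $\mathbf{s} \cdot \mathbf{y} = -k_{cm}^2$ is easy to evaluate once a mass distribution is chosen. The Python sketch below idealizes the bat as a uniform thin rod, for which $I_{cm} = M\ell^2/12$. This idealization, the numerical values, and the function names are assumptions made only for illustration; positions $s$ and $y$ are signed distances along the bat axis from the center of mass.

```python
def axis_point_velocity(P, M, I_cm, s, y):
    """Velocity (along P) of the axis point at y just after impulse P is applied at s.

    Implements v(y) = (P / M) * (1 + M * s * y / I_cm).
    """
    return (P / M) * (1.0 + M * s * y / I_cm)

def center_of_percussion(M, I_cm, s):
    """Signed position y satisfying s * y = -I_cm / M = -k_cm**2."""
    return -I_cm / (M * s)

# Assumption: uniform thin rod of mass M and length ell, so I_cm = M * ell**2 / 12
M, ell = 0.9, 0.84             # kg, m (illustrative values)
I_cm = M * ell**2 / 12.0
s = 0.15                       # strike point, 0.15 m from the center of mass
y = center_of_percussion(M, I_cm, s)
print(y, axis_point_velocity(1.0, M, I_cm, s, y))
```

The negative sign of $y$ shows that the stationary point lies on the opposite side of the center of mass from the strike point, and the velocity there vanishes to rounding error.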


2.15 Example: Energy transfer in charged-particle scattering 


Consider a particle of charge $+e_1$ moving with very high velocity $v_0$ along a straight line that passes a
distance $b$ from another charge $+e_2$ and mass $m$. Find the energy transferred to the mass $m$ during the
encounter assuming the force is given by Coulomb’s law electrostatics. Since the charged particle $e_1$ moves
at very high speed it is assumed that charge 2 does not change position during the encounter.
Assume that charge 1 moves along the $-y$ axis through the origin while charge 2 is located on the $x$ axis
at $x = b$. Let us consider the impulse given to charge 2 during the encounter. By symmetry the $y$
component must cancel while the $x$ component is given by

$$dp_x = F_x\, dt = -\frac{e_1 e_2}{4\pi\epsilon_0 r^2}\cos\theta\, dt = -\frac{e_1 e_2}{4\pi\epsilon_0 r^2}\cos\theta\,\frac{dt}{d\theta}\, d\theta$$

[Figure: Charged-particle scattering]

But

$$r\dot{\theta} = -v_0\cos\theta$$

where

$$\frac{b}{r} = \cos(\pi - \theta) = -\cos\theta$$

Thus

$$dp_x = -\frac{e_1 e_2}{4\pi\epsilon_0 b v_0}\cos\theta\, d\theta$$

Integrating over $\frac{\pi}{2} < \theta < \frac{3\pi}{2}$ gives that the total momentum imparted to $e_2$ is

$$p_x = -\frac{e_1 e_2}{4\pi\epsilon_0 b v_0}\int_{\pi/2}^{3\pi/2}\cos\theta\, d\theta = \frac{e_1 e_2}{2\pi\epsilon_0 b v_0}$$

Thus the recoil energy of charge 2 is given by

$$E_2 = \frac{p_x^2}{2m} = \frac{1}{2m}\left(\frac{e_1 e_2}{2\pi\epsilon_0 b v_0}\right)^2$$

2.13 Solution of many-body equations of motion 


The following are general methods used to solve Newton’s many-body equations of motion for practical 
problems. 


2.13.1 Analytic solution 


In practical problems one has to solve a set of equations of motion since the forces depend on the location 
of every body involved. For example one may be dealing with a set of coupled oscillators such as the 
many components that comprise the suspension system of an automobile. Often the coupled equations of 
motion comprise a set of coupled second-order differential equations. The first approach to solve such a 
system is to try an analytic solution comprising a general solution of the homogeneous equation plus one
particular solution of the inhomogeneous equation. Another approach is to employ numeric integration using
a computer. 


2.13.2 Successive approximation 


When the system of coupled differential equations of motion is too complicated to solve analytically, one 
can use the method of successive approximation. The differential equations are transformed to integral 
equations. Then one starts with some initial conditions to make a first order estimate of the functions. The 
functions determined by this first order estimate then are used in a second iteration and this is repeated 
until the solution converges. An example of this approach is when making Hartree-Fock calculations of the
electron distributions in an atom. The first order calculation uses the electron distributions predicted by
the one-electron model of the atom. This result then is used to compute the influence of the electron charge
distribution around the nucleus on the charge distribution of the atom for a second iteration, and so on.


2.13.3 Perturbation method 


The perturbation technique can be applied if the force separates into two parts $\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2$ where $F_1 \gg F_2$
and the solution is known for the dominant $\mathbf{F}_1$ part of the force. Then the correction to this solution due
to addition of the perturbation $\mathbf{F}_2$ usually is easier to evaluate. As an example, consider that one of the
Space Shuttle thrusters fires. In principle one has all the gravitational forces acting plus the thrust force 
of the thruster. The perturbation approach is to assume that the trajectory of the Space Shuttle in the 
earth’s gravitational field is known. Then the perturbation to this motion due to the very small thrust, 
produced by the thruster, is evaluated as a small correction to the motion in the Earth’s gravitational field. 
This perturbation technique is used extensively in physics, especially in quantum physics. An example
from my own research is scattering of a 1 GeV $^{208}$Pb ion in the Coulomb field of a $^{197}$Au nucleus. The
trajectory for elastic scattering is simple to calculate since neither nucleus is excited and the total energy and
momenta are conserved. However, usually one of these nuclei will be internally excited by the electromagnetic
interaction. This is called Coulomb excitation. The effect of the Coulomb excitation usually can be treated as
a perturbation by assuming that the trajectory is given by the elastic scattering solution and then calculating
the excitation probability assuming the Coulomb excitation of the nucleus is a small perturbation to the
trajectory.


2.14 Newton’s Law of Gravitation


Gravitation plays a fundamental role in classical mechanics as well as being an important example of a
conservative central $\frac{1}{r^2}$ force. Although you may not be familiar with use of vector calculus for the
gravitational field $\mathbf{g}$, it is assumed that you have met the identical approach for studies of the electric
field $\mathbf{E}$ in electrostatics. The primary difference is that mass $m$ replaces charge $e$, and gravitational
field $\mathbf{g}$ replaces the electric field $\mathbf{E}$. This chapter reviews the concepts of vector calculus as used for
study of conservative inverse-square law central fields.

In 1666 Newton formulated the Theory of Gravitation which he eventually published in the Principia in 1687.
Newton’s Law of Gravitation states that each mass particle attracts every other particle in the universe with
a force that varies directly as the product of the masses and inversely as the square of the distance between
them. That is, the force on a gravitational point mass $m_G$ produced by a mass $M_G$ is

$$\mathbf{F}_m = -G\frac{m_G M_G}{r^2}\hat{\mathbf{r}}$$ (2.143)

where $\hat{\mathbf{r}}$ is the unit vector pointing from the gravitational
mass $M_G$ to the gravitational mass $m_G$ as shown in figure 2.7. Note that the force is attractive, that
is, it points toward the other mass. This is in contrast to the repulsive electrostatic force between two
similar charges. Newton’s law was verified by Cavendish using a torsion balance. The experimental value is
$G = (6.6726 \pm 0.0008) \times 10^{-11}$ N·m$^2$/kg$^2$.

Figure 2.7: Gravitational force on mass $m$ due to an infinitesimal volume element of the mass
density distribution.

The gravitational force between point particles can be extended to finite-sized bodies using the fact that
the gravitational force field satisfies the superposition principle, that is, the net force is the vector sum of the
individual forces between the component point particles. Thus the force summed over the mass distribution
is

$$\mathbf{F}_m(\mathbf{r}) = -G m_G \sum_{i=1}^n \frac{m_{Gi}}{r_i^2}\hat{\mathbf{r}}_i$$ (2.144)

where $\mathbf{r}_i$ is the vector from the gravitational mass $m_{Gi}$ to the gravitational mass $m_G$ at the position $\mathbf{r}$.
For a continuous gravitational mass distribution $\rho_G(\mathbf{r}')$, the net force on the gravitational mass $m_G$ at
the location $\mathbf{r}$ can be written as

$$\mathbf{F}_m(\mathbf{r}) = -G m_G \int_V \frac{\rho_G(\mathbf{r}')\left(\widehat{\mathbf{r} - \mathbf{r}'}\right)}{(\mathbf{r} - \mathbf{r}')^2}\, dv'$$ (2.145)

where $\widehat{\mathbf{r} - \mathbf{r}'}$ is the unit vector along $\mathbf{r} - \mathbf{r}'$ and $dv'$ is the volume element at the point $\mathbf{r}'$ as illustrated
in figure 2.7.


2.14.1 Gravitational and inertial mass 
Newton’s Laws use the concept of inertial mass $m_I \equiv m$ in relating the force $\mathbf{F}$ to acceleration $\mathbf{a}$

$$\mathbf{F} = m_I\mathbf{a}$$ (2.146)

and momentum $\mathbf{p}$ to velocity $\mathbf{v}$

$$\mathbf{p} = m_I\mathbf{v}$$ (2.147)

That is, inertial mass is the constant of proportionality relating the acceleration to the applied force.
The gravitational mass $m_G$ is the constant of proportionality between the gravitational force
and the amount of matter. That is, on the surface of the earth, the gravitational force is assumed to be

$$\mathbf{F}_G = m_G\left[-G\sum_{i=1}^n \frac{m_{Gi}}{r_i^2}\hat{\mathbf{r}}_i\right] = m_G\mathbf{g}$$ (2.148)

where $\mathbf{g}$ is the gravitational field which is a position-dependent force per unit gravitational mass pointing
towards the center of the Earth. The gravitational mass is measured when an object is weighed.

Newton’s Law of Gravitation leads to the relation for the gravitational field $\mathbf{g}(\mathbf{r})$ at the location $\mathbf{r}$ due
to a gravitational mass distribution at the location $\mathbf{r}'$ as given by the integral over the gravitational mass
density $\rho_G$

$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho_G(\mathbf{r}')\left(\widehat{\mathbf{r} - \mathbf{r}'}\right)}{(\mathbf{r} - \mathbf{r}')^2}\, dv'$$ (2.149)

The acceleration of matter in a gravitational field relates the gravitational and inertial masses

$$\mathbf{F}_G = m_G\mathbf{g} = m_I\mathbf{a}$$ (2.150)

Thus

$$\mathbf{a} = \frac{m_G}{m_I}\mathbf{g}$$ (2.151)


That is, the acceleration of a body depends on the gravitational strength g and the ratio of the gravitational 
and inertial masses. It has been shown experimentally that all matter is subject to the same acceleration 
in vacuum at a given location in a gravitational field. That is, $\frac{m_G}{m_I}$ is a constant common to all materials.
Galileo first showed this when he dropped objects from the Tower of Pisa. Modern experiments have shown 
that this is true to 5 parts in $10^{13}$.

The exact equivalence of gravitational mass and inertial mass is called the weak principle of equiva- 
lence which underlies the General Theory of Relativity as discussed in chapter 17. It is convenient to use 
the same unit for the gravitational and inertial masses and thus they both can be written in terms of the 
common mass symbol m. 

$$m_I = m_G = m$$ (2.152)

Therefore the subscripts $G$ and $I$ can be omitted in equations 2.150 and 2.152. Also the local acceleration
due to gravity $\mathbf{a}$ can be written as

$$\mathbf{a} = \mathbf{g}$$ (2.153)

The gravitational field $\mathbf{g} \equiv \frac{\mathbf{F}}{m}$ has units of N/kg in the MKS system while the acceleration $\mathbf{a}$ has units m/s$^2$.


2.14.2 Gravitational potential energy U 


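Chapter 2.10.2 showed that a conservative field can be expressed in terms of the concept of a potential
energy $U(\mathbf{r})$ which depends on position. The potential energy difference $\Delta U_{a\to b}$ between two points
$\mathbf{r}_a$ and $\mathbf{r}_b$ is the work done moving from $a$ to $b$ against a force $\mathbf{F}$. That is:

$$\Delta U_{a\to b} = U(\mathbf{r}_b) - U(\mathbf{r}_a) = -\int_{r_a}^{r_b}\mathbf{F}\cdot d\mathbf{l}$$ (2.154)

In general, this line integral depends on the path taken.

Figure 2.8: Work done against a force field moving from $a$ to $b$.

Consider the gravitational field produced by a single point mass $m_1$. The work done moving a mass $m_0$
from $\mathbf{r}_a$ to $\mathbf{r}_b$ in this gravitational field can be calculated along an arbitrary path shown in figure
2.8 by assuming Newton’s law of gravitation. Then the force on $m_0$ due to point mass $m_1$ is

$$\mathbf{F} = -G\frac{m_1 m_0}{r^2}\hat{\mathbf{r}}$$ (2.155)

Expressing $d\mathbf{l}$ in spherical coordinates, $d\mathbf{l} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}$, the
path integral (2.154) from $(r_a, \theta_a, \phi_a)$ to $(r_b, \theta_b, \phi_b)$ is

$$\Delta U_{a\to b} = -\int_a^b \mathbf{F}\cdot d\mathbf{l} = \int_a^b G\frac{m_1 m_0}{r^2}\left[\hat{\mathbf{r}}\cdot\hat{\mathbf{r}}\,dr + \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}}\,r\,d\theta + \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}}\,r\sin\theta\,d\phi\right] = G m_1 m_0\int_a^b \frac{1}{r^2}\,\hat{\mathbf{r}}\cdot\hat{\mathbf{r}}\,dr$$
$$= -G m_1 m_0\left[\frac{1}{r_b} - \frac{1}{r_a}\right]$$ (2.156)

since the scalar product of the unit vectors $\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = 1$. Note that the second two terms also cancel since
$\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}} = \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}} = 0$ since the unit vectors are mutually orthogonal. Thus the line integral depends only on
the starting and ending radii and is independent of the angular coordinates or the detailed path taken between
$(r_a, \theta_a, \phi_a)$ and $(r_b, \theta_b, \phi_b)$.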

Consider the Principle of Superposition for a gravitational field produced by a set of $n$ point masses. The
line integral then can be written as:

$$\Delta U^{net}_{a\to b} = -\int_{r_a}^{r_b}\mathbf{F}_{net}\cdot d\mathbf{l} = -\sum_{i=1}^n\int_{r_a}^{r_b}\mathbf{F}_i\cdot d\mathbf{l} = \sum_{i=1}^n \Delta U^i_{a\to b}$$ (2.157)

Thus the net potential energy difference is the sum of the contributions from each point mass producing the
gravitational force field. Since each component is conservative, the total potential energy difference also
must be conservative. For a conservative force, this line integral is independent of the path taken; it depends
only on the starting and ending positions, $\mathbf{r}_a$ and $\mathbf{r}_b$. That is, the potential energy is a local function
dependent only on position. The usefulness of gravitational potential energy is that, since the gravitational
force is a conservative force, it is possible to solve many problems in classical mechanics using the fact
that the sum of the kinetic energy and potential energy is a constant. Note that the gravitational field is
conservative, since the potential energy difference $\Delta U^{net}_{a\to b}$ is independent of the path taken. It is conservative
because the force is radial and time independent; it is not due to the $\frac{1}{r^2}$ dependence of the field.
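Path independence can be checked numerically by evaluating $-\int \mathbf{F}\cdot d\mathbf{l}$ along two different routes between the same end points. The Python sketch below works in units where $G m_1 m_0 = 1$ and places $m_1$ at the origin. The end points, the detour, and the function names are arbitrary assumptions made for illustration.

```python
import math

def gravity(p, g_m1_m0=1.0):
    """Force on m0 at point p due to m1 at the origin: F = -G m1 m0 r_hat / r**2."""
    r = math.sqrt(sum(c * c for c in p))
    return tuple(-g_m1_m0 * c / r**3 for c in p)

def delta_U(path, n=5000):
    """Return -integral of F . dl along straight segments joining the listed points."""
    total = 0.0
    for start, end in zip(path[:-1], path[1:]):
        step = tuple((e - s) / n for s, e in zip(start, end))
        for i in range(n):
            mid = tuple(s + (i + 0.5) * d for s, d in zip(start, step))
            total -= sum(f * d for f, d in zip(gravity(mid), step))
    return total

a, b = (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)          # r_a = 1 and r_b = 3
direct = delta_U([a, b])
detour = delta_U([a, (2.0, 2.0, 2.0), (-1.0, 1.0, 4.0), b])
exact = -(1.0 / 3.0 - 1.0 / 1.0)                 # equation 2.156 with G m1 m0 = 1
print(direct, detour, exact)
```

Both routes should reproduce the value given by equation 2.156, which depends only on the two radii.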


2.14.3 Gravitational potential $\phi$


Using $\mathbf{F} = m_0\mathbf{g}$ gives that the change in potential energy due to moving a mass $m_0$ from $a$ to $b$ in a
gravitational field $\mathbf{g}$ is:

$$\Delta U^{net}_{a\to b} = -m_0\int_{r_a}^{r_b}\mathbf{g}_{net}\cdot d\mathbf{l}$$ (2.158)

Note that the probe mass $m_0$ factors out from the integral. It is convenient to define a new quantity called
gravitational potential $\phi$ where

$$\Delta\phi^{net}_{a\to b} = \frac{\Delta U^{net}_{a\to b}}{m_0} = -\int_{r_a}^{r_b}\mathbf{g}_{net}\cdot d\mathbf{l}$$ (2.159)

That is, gravitational potential difference is the work that must be done, per unit mass, to move from $a$ to
$b$ with no change in kinetic energy. Be careful not to confuse the gravitational potential energy difference
$\Delta U_{a\to b}$ and gravitational potential difference $\Delta\phi_{a\to b}$; that is, $\Delta U$ has units of energy, Joules, while $\Delta\phi$ has
units of Joules/kg.

The gravitational potential is a property of the gravitational force field; it is given as minus the line
integral of the gravitational field from $a$ to $b$. The change in gravitational potential energy for moving a
mass $m_0$ from $a$ to $b$ is given in terms of gravitational potential by:

$$\Delta U^{net}_{a\to b} = m_0\Delta\phi^{net}_{a\to b}$$ (2.160)


Superposition and potential 


Previously it was shown that the gravitational force is conservative for the superposition of many masses.
To recap, if the gravitational field

$$\mathbf{g}_{net} = \mathbf{g}_1 + \mathbf{g}_2 + \mathbf{g}_3$$ (2.161)

then

$$\Delta\phi^{net}_{a\to b} = -\int_{r_a}^{r_b}\mathbf{g}_{net}\cdot d\mathbf{l} = -\int_{r_a}^{r_b}\mathbf{g}_1\cdot d\mathbf{l} - \int_{r_a}^{r_b}\mathbf{g}_2\cdot d\mathbf{l} - \int_{r_a}^{r_b}\mathbf{g}_3\cdot d\mathbf{l} = \sum_i \Delta\phi^i_{a\to b}$$ (2.162)

Thus gravitational potential is a simple additive scalar field because the Principle of Superposition applies.
The gravitational potential difference between two points differing by $h$ in height is $gh$. Clearly, the greater $g$ or $h$,
the greater the energy released by the gravitational field when dropping a body through the height $h$. The
unit of gravitational potential is the Joule/kg.


2.14.4 Potential theory


The gravitational force and electrostatic force both obey the inverse square law, for which the field and 
corresponding potential are related by: 


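$$\Delta\phi_{a\to b} = -\int_{r_a}^{r_b} \mathbf{g}\cdot d\mathbf{l} \tag{2.163}$$

For an arbitrary infinitesimal element distance $d\mathbf{l}$ the change in gravitational potential $d\phi$ is

$$d\phi = -\mathbf{g}\cdot d\mathbf{l} \tag{2.164}$$

Using cartesian coordinates both $\mathbf{g}$ and $d\mathbf{l}$ can be written as

$$\mathbf{g} = \hat{\mathbf{i}}g_x + \hat{\mathbf{j}}g_y + \hat{\mathbf{k}}g_z \qquad d\mathbf{l} = \hat{\mathbf{i}}\,dx + \hat{\mathbf{j}}\,dy + \hat{\mathbf{k}}\,dz \tag{2.165}$$

Taking the scalar product gives:

$$d\phi = -\mathbf{g}\cdot d\mathbf{l} = -g_x\,dx - g_y\,dy - g_z\,dz \tag{2.166}$$

Differential calculus expresses the change in potential $d\phi$ in terms of partial derivatives by:

$$d\phi = \frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz \tag{2.167}$$

By association, 2.166 and 2.167 imply that

$$g_x = -\frac{\partial\phi}{\partial x} \qquad g_y = -\frac{\partial\phi}{\partial y} \qquad g_z = -\frac{\partial\phi}{\partial z} \tag{2.168}$$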


Thus on each axis, the gravitational field can be written as minus the gradient of the gravitational potential.
In three dimensions, the gravitational field is minus the total gradient of potential, and in terms of the gradient of the
scalar function $\phi$ it can be written as:

$$\mathbf{g} = -\nabla\phi \tag{2.169}$$

In cartesian coordinates this equals

$$\mathbf{g} = -\left[\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right] \tag{2.170}$$

Thus the gravitational field is just minus the gradient of the gravitational potential, which always is perpendicular
to the equipotentials. Skiers are familiar with the concept of gravitational equipotentials and the fact that
the line of steepest descent, and thus maximum acceleration, is perpendicular to gravitational equipotentials
of constant height. The advantage of using potential theory for inverse-square law forces is that scalar
potentials replace the more complicated vector forces, which greatly simplifies calculation. Potential theory
plays a crucial role for handling both gravitational and electrostatic forces.
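The relation between field and potential can be checked numerically. The following is a minimal sketch, assuming a single point mass at the origin with potential $\phi = -Gm/r$; the mass value, the field point, the step size h, and the function names are illustrative choices and are not taken from the text. It estimates $\mathbf{g} = -\nabla\phi$ by central differences and compares the result with the inverse-square field.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 / kg^2


def potential(x, y, z, m=5.0e10):
    """Potential phi = -G m / r of a point mass m (kg) placed at the origin."""
    return -G * m / math.sqrt(x * x + y * y + z * z)


def field_from_potential(phi, x, y, z, h=1.0e-3):
    """Estimate g = -grad(phi) with central differences of step h (metres)."""
    gx = -(phi(x + h, y, z) - phi(x - h, y, z)) / (2.0 * h)
    gy = -(phi(x, y + h, z) - phi(x, y - h, z)) / (2.0 * h)
    gz = -(phi(x, y, z + h) - phi(x, y, z - h)) / (2.0 * h)
    return gx, gy, gz


def field_exact(x, y, z, m=5.0e10):
    """Newtonian field g = -G m r_hat / r^2 of the same point mass."""
    r = math.sqrt(x * x + y * y + z * z)
    scale = -G * m / r**3
    return scale * x, scale * y, scale * z


if __name__ == "__main__":
    point = (3.0, 4.0, 12.0)  # field point, 13 m from the mass
    print("from potential:", field_from_potential(potential, *point))
    print("direct formula:", field_exact(*point))
```

The two printed vectors should agree closely, which illustrates why it is often easier to compute the scalar potential first and differentiate afterwards.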


2.14.5 Curl of the gravitational field 


It has been shown that the gravitational field is conservative, that is,
$\Delta U_{a\to b}$ is independent of the path taken between a and b. Therefore,
equation 2.159 gives that the gravitational potential is independent of
the path taken between two points a and b. Consider two possible paths
between a and b as shown in figure 2.9. The line integral from a to b via
route 1 is equal and opposite to the line integral back from b to a via
route 2 if the gravitational field is conservative as shown earlier.

Figure 2.9: Circulation of the gravitational field.

A better way of expressing this is that the line integral of the gravitational
field is zero around any closed path. Thus the line integral between
a and b, via path 1, and returning back to a, via path 2, are equal and
opposite. That is, the net line integral for a closed loop is zero

$$\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0 \tag{2.171}$$


which is a measure of the circulation of the gravitational field. The fact that the circulation equals zero 
corresponds to the statement that the gravitational field is radial for a point mass. 
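Stokes' theorem, discussed in appendix H3, states that

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{area bounded by } C} (\nabla\times\mathbf{F})\cdot d\mathbf{S} \tag{2.172}$$

Thus the zero circulation of the gravitational field can be rewritten as

$$\oint_C \mathbf{g}\cdot d\mathbf{l} = \int_{\text{area bounded by } C} (\nabla\times\mathbf{g})\cdot d\mathbf{S} = 0 \tag{2.173}$$

Since this is independent of the shape of the perimeter C, therefore

$$\nabla\times\mathbf{g} = 0 \tag{2.174}$$

That is, the gravitational field is a curl-free field.
A property of any curl-free field is that it can be expressed as the gradient of a scalar potential $\phi$ since

$$\nabla\times\nabla\phi = 0 \tag{2.175}$$

Therefore, the curl-free gravitational field can be related to a scalar potential $\phi$ as

$$\mathbf{g} = -\nabla\phi \tag{2.176}$$

Thus $\phi$ is consistent with the above definition of gravitational potential $\phi$ in that the scalar product

$$\Delta\phi_{a\to b} = -\int_a^b \mathbf{g}_{net}\cdot d\mathbf{l} = \int_a^b (\nabla\phi)\cdot d\mathbf{l} = \int_a^b \sum_i \frac{\partial\phi}{\partial x_i}dx_i = \int_a^b d\phi \tag{2.177}$$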


An identical relation between the electric field and electric potential applies for the inverse-square law 
electrostatic field. 
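The zero circulation of a conservative field can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the point mass at the origin, the closed path, the number of chords n, and the contrasting field `swirl` are all hypothetical choices, not taken from the text. It sums $\mathbf{F}\cdot\Delta\mathbf{l}$ over short chords of a closed loop, once for the gravitational field and once for a field whose curl is not zero.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 / kg^2


def g_point_mass(x, y, z, m=1.0e12):
    """Inverse-square field of a point mass m (kg) at the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    scale = -G * m / r**3
    return scale * x, scale * y, scale * z


def swirl(x, y, z):
    """A field with non-zero curl, used only for contrast: F = (-y, x, 0)."""
    return -y, x, 0.0


def loop(t):
    """Closed path; its projection on the xy-plane is a circle of radius 2
    centred at (5, 1), so the path never passes through the mass."""
    return 5.0 + 2.0 * math.cos(t), 1.0 + 2.0 * math.sin(t), 0.5 * math.sin(t)


def circulation(field, path, n=2000):
    """Approximate the closed line integral of field . dl using n chords."""
    total = 0.0
    for k in range(n):
        t0 = 2.0 * math.pi * k / n
        t1 = 2.0 * math.pi * (k + 1) / n
        p0, p1 = path(t0), path(t1)
        f = field(*path(0.5 * (t0 + t1)))  # field at the mid-parameter point
        total += sum(f[i] * (p1[i] - p0[i]) for i in range(3))
    return total


if __name__ == "__main__":
    print("gravity:", circulation(g_point_mass, loop))
    print("swirl:  ", circulation(swirl, loop))
```

The gravitational circulation should vanish to within rounding error, whereas the contrasting field gives twice the projected loop area, $8\pi$, as expected from Stokes' theorem for a uniform curl of 2 along z.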


Reference potentials: 


Note that only differences in potential energy, U, and gravitational potential, $\phi$, are meaningful; the absolute
values depend on some arbitrarily chosen reference. However, often it is useful to measure gravitational
potential with respect to a particular arbitrarily chosen reference point $\phi_0$, such as sea level. Aircraft
pilots are required to set their altimeters to read with respect to sea level rather than their departure
airport. This ensures that aircraft leaving from, say, both Rochester, 559 ft msl, and Denver, 5000 ft msl, have
their altimeters set to a common reference to ensure that they do not collide. The gravitational field is
minus the gradient of the gravitational potential, which depends only on differences in potential, and thus is independent of
any constant reference.


Gravitational potential due to continuous distributions of mass

Suppose mass is distributed
over a volume v with a density $\rho$ at any point within the volume. The gravitational potential at any field
point p due to an element of mass $dm = \rho\,dv'$ at the point p' is given by:

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}} \tag{2.178}$$

This integral is over a scalar quantity. Since gravitational potential $\phi$ is a scalar quantity, it is easier to
compute than is the vector gravitational field $\mathbf{g}$. If the scalar potential field is known, then the gravitational
field is derived by taking minus the gradient of the gravitational potential.
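As an illustration of the scalar integral 2.178, the sketch below evaluates it by a simple midpoint rule for a uniformly dense sphere and a field point outside the sphere. The function name, the grid size n, and the Earth-like values of M and R in the usage lines are assumptions made for the example; the field point is assumed to lie at a distance d > R so that the integrand has no singularity.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 / kg^2


def potential_uniform_sphere(d, M, R, n=200):
    """Midpoint-rule estimate of phi = -G * integral(rho dv' / r_p'p) for a
    uniform sphere (mass M, radius R) at a field point a distance d > R
    from the centre. The azimuthal integral is done analytically (2 pi)."""
    rho = 3.0 * M / (4.0 * math.pi * R**3)
    dr = R / n
    dth = math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        for j in range(n):
            th = (j + 0.5) * dth
            # distance from the mass element to the field point on the polar axis
            sep = math.sqrt(r * r + d * d - 2.0 * r * d * math.cos(th))
            dv = 2.0 * math.pi * r * r * math.sin(th) * dr * dth
            total += rho * dv / sep
    return -G * total


if __name__ == "__main__":
    M, R = 5.97e24, 6.371e6  # illustrative Earth-like values
    d = 2.0 * R
    print("numerical:", potential_uniform_sphere(d, M, R))
    print("-G M / d: ", -G * M / d)
```

Outside the sphere the numerical result should match the point-mass value $-GM/d$, which is the behaviour derived with Gauss's law in the uniform-sphere example later in this chapter.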
2.14.6 Gauss's Law for Gravitation


The flux $\Phi$ of the gravitational field $\mathbf{g}$ through a surface
S, as shown in figure 2.10, is defined as

$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} \tag{2.179}$$

Note that there are two possible perpendicular directions
that could be chosen for the surface vector $d\mathbf{S}$. Using
Newton's law of gravitation for a point mass m the flux
through the surface S is

$$\Phi = -Gm\int_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.180}$$

Note that the solid angle subtended by the surface $d\mathbf{S}$
at an angle $\theta$ to the normal from the point mass is given
by

$$d\Omega = \frac{\cos\theta\,dS}{r^2} = \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.181}$$

Thus the net gravitational flux equals

$$\Phi = -Gm\int_S d\Omega \tag{2.182}$$

Figure 2.10: Flux of the gravitational field through an infinitesimal surface element dS.


Consider a closed surface where the direction of the surface vector $d\mathbf{S}$ is defined as outwards. The net
flux out of this closed surface is given by

$$\Phi = -Gm\oint_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = -Gm\oint_S d\Omega = -4\pi Gm \tag{2.183}$$

This is independent of where the point mass lies within the closed surface and of the shape of the closed
surface. Note that the net solid angle subtended is zero if the point mass lies outside the closed surface. Thus
the flux is as given by equation 2.183 if the mass is enclosed by the closed surface, while it is zero if the mass
is outside of the closed surface.
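The statement that the flux depends only on whether the mass is enclosed can be tested numerically on a surface that has no special symmetry about the mass. The following sketch, under assumptions chosen purely for illustration (a cube of half-width 1 m centred on the origin, a midpoint rule with n by n cells per face, and arbitrary mass positions), sums $\mathbf{g}\cdot d\mathbf{S}$ over the six faces.

```python
import math

G = 6.674e-11  # gravitational constant in N m^2 / kg^2


def flux_through_cube(mass_pos, m, half=1.0, n=100):
    """Midpoint-rule flux of g, for a point mass m at mass_pos, out of the
    cube [-half, half]^3. Each face is divided into n x n square cells."""
    h = 2.0 * half / n
    flux = 0.0
    for axis in range(3):            # face normal along x, y or z
        for sign in (-1.0, 1.0):     # the two opposite faces
            for i in range(n):
                for j in range(n):
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = -half + (i + 0.5) * h
                    p[(axis + 2) % 3] = -half + (j + 0.5) * h
                    d = [p[k] - mass_pos[k] for k in range(3)]
                    r = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
                    # g . (outward normal), with g = -G m d / r^3
                    g_normal = -G * m * d[axis] * sign / r**3
                    flux += g_normal * h * h
    return flux


if __name__ == "__main__":
    m = 1.0e10
    print("mass inside: ", flux_through_cube((0.2, -0.1, 0.3), m))
    print("mass outside:", flux_through_cube((3.0, 0.2, 0.1), m))
    print("-4 pi G m:   ", -4.0 * math.pi * G * m)
```

With the mass inside the cube the sum should approach $-4\pi Gm$ regardless of where inside it sits, and with the mass outside it should approach zero.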

Since the flux for a point mass is independent of the location of the mass within the volume enclosed by
the closed surface, and using the principle of superposition for the gravitational field, then for n enclosed
point masses the net flux is

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\sum_i^n m_i \tag{2.184}$$

This can be extended to continuous mass distributions, with local mass density $\rho$, giving that the net flux

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv \tag{2.185}$$

Gauss's Divergence Theorem was given in appendix H2 as

$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla\cdot\mathbf{F}\,dv \tag{2.186}$$

Applying the Divergence Theorem to Gauss's law gives that

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla\cdot\mathbf{g}\,dv = -4\pi G\int_{\text{enclosed volume}} \rho\,dv$$

or

$$\int_{\text{enclosed volume}} \left[\nabla\cdot\mathbf{g} + 4\pi G\rho\right]dv = 0 \tag{2.187}$$

This is true independent of the shape of the surface, thus the divergence of the gravitational field

$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.188}$$

This is a statement that the gravitational field of a point mass has a $1/r^2$ dependence.
Using the fact that the gravitational field is conservative, this can be expressed as the gradient of the
gravitational potential $\phi$,

$$\mathbf{g} = -\nabla\phi \tag{2.189}$$

and Gauss's law then becomes

$$\nabla\cdot\nabla\phi = 4\pi G\rho \tag{2.190}$$

which also can be written as Poisson's equation

$$\nabla^2\phi = 4\pi G\rho \tag{2.191}$$


Knowing the mass distribution p allows determination of the potential by solving Poisson’s equation. 
A special case that often is encountered is when the mass distribution is zero in a given region. Then the 
potential for this region can be determined by solving Laplace’s equation with known boundary conditions. 


$$\nabla^2\phi = 0 \tag{2.192}$$

For example, Laplace's equation applies in the free space between the masses. It is used extensively in electrostatics
to compute the electric potential between charged conductors which themselves are equipotentials.
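Poisson's and Laplace's equations can be checked on a known solution. The sketch below assumes the standard potential of a uniformly dense sphere, $\phi = -GM(3R^2 - r^2)/(2R^3)$ for r < R and $\phi = -GM/r$ for r > R, together with illustrative Earth-like values for M and R and an arbitrary finite-difference step h. It evaluates the radial form of the Laplacian, $\nabla^2\phi = \phi'' + 2\phi'/r$, by central differences.

```python
import math

G = 6.674e-11      # gravitational constant in N m^2 / kg^2
M = 5.97e24        # kg, illustrative Earth-like mass
R = 6.371e6        # m, illustrative Earth-like radius
RHO = 3.0 * M / (4.0 * math.pi * R**3)  # uniform mass density


def phi(r):
    """Potential of a uniformly dense sphere, inside and outside."""
    if r < R:
        return -G * M * (3.0 * R**2 - r**2) / (2.0 * R**3)
    return -G * M / r


def radial_laplacian(f, r, h=1.0e4):
    """Laplacian of a spherically symmetric function f(r): f'' + 2 f'/r,
    estimated with central differences of step h (metres)."""
    d1 = (f(r + h) - f(r - h)) / (2.0 * h)
    d2 = (f(r + h) - 2.0 * f(r) + f(r - h)) / h**2
    return d2 + 2.0 * d1 / r


if __name__ == "__main__":
    print("inside, del^2 phi: ", radial_laplacian(phi, 0.5 * R))
    print("4 pi G rho:        ", 4.0 * math.pi * G * RHO)
    print("outside, del^2 phi:", radial_laplacian(phi, 2.0 * R))
```

Inside the sphere the Laplacian should reproduce $4\pi G\rho$ (Poisson's equation), while outside, where the density is zero, it should be negligibly small (Laplace's equation).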


2.14.7 Condensed forms of Newton’s Law of Gravitation 


The above discussion has resulted in several alternative expressions of Newton’s Law of Gravitation that will 
be summarized here. The most direct statement of Newton’s law is 


$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv' \tag{2.193}$$

An elegant way to express Newton's Law of Gravitation is in terms of the flux and circulation of the
gravitational field. That is,

Flux:

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv \tag{2.194}$$

Circulation:

$$\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0 \tag{2.195}$$

The flux and circulation are better expressed in terms of the vector differential concepts of divergence
and curl.

Divergence:

$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.196}$$

Curl:

$$\nabla\times\mathbf{g} = 0 \tag{2.197}$$

Remember that the flux and divergence of the gravitational field are statements that the field between
point masses has a $1/r^2$ dependence. The circulation and curl are statements that the field between point
masses is radial.

Because the gravitational field is conservative it is possible to use the concept of the scalar potential
field $\phi$. This concept is especially useful for solving some problems since the gravitational potential can be
evaluated using the scalar integral

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}} \tag{2.198}$$

An alternate approach is to solve Poisson's equation if the boundary values and mass distributions are known
where Poisson's equation is:

$$\nabla^2\phi = 4\pi G\rho \tag{2.199}$$


These alternate expressions of Newton’s law of gravitation can be exploited to solve problems. The 
method of solution is identical to that used in electrostatics. 


2.16 Example: Field of a uniform sphere 


Consider the simple case of the gravitational field due to a uniform sphere of matter of radius R and
mass M. Then the volume mass density

$$\rho = \frac{3M}{4\pi R^3}$$

The gravitational field and potential for this uniform sphere of matter can be derived three ways:

a) The field can be evaluated by directly integrating over the volume (equation 2.193).

b) The potential can be evaluated directly by integration of

$$\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$$

and then

$$\mathbf{g} = -\nabla\phi$$

c) The obvious spherical symmetry can be used in conjunction
with Gauss's law to easily solve this problem.

$$\oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed volume}} \rho\,dv$$

$$4\pi r^2 g(r) = -4\pi GM \qquad (r > R)$$

That is: for r > R

$$\mathbf{g} = -G\frac{M}{r^2}\hat{\mathbf{r}} \qquad (r > R)$$

Similarly, for r < R

$$4\pi r^2 g(r) = -4\pi G\,\frac{4\pi}{3}r^3\rho \qquad (r < R)$$

That is:

$$\mathbf{g} = -G\frac{M}{R^3}\mathbf{r} \qquad (r < R)$$

Figure: Gravitational field g and gravitational potential $\Phi$ of a uniformly-dense spherical mass distribution of radius R. The plotted potential is $-\frac{GM}{2R^3}(3R^2 - r^2)$ for r < R and $-\frac{GM}{r}$ for r > R.

The field inside the Earth is radial and is proportional to the distance from the center of the Earth. This
is Hooke's Law, and thus ignoring air drag, any body dropped down a hole through the center of the Earth
will undergo harmonic oscillations with an angular frequency of $\omega_0 = \sqrt{\frac{GM}{R^3}} = \sqrt{\frac{g}{R}}$. This gives a period of
oscillation of 1.4 hours, which is about the length of a P235 lecture in classical mechanics, which may seem
like a long time.

Clearly method (c) is much simpler to solve for this case. In general, look for a symmetry that allows
identification of a surface upon which the magnitude and direction of the field are constant. For such cases
use Gauss's law. Otherwise use method (a) or (b), whichever is easier to apply. Further examples will
not be given here since they are essentially identical to those discussed extensively in electrostatics.
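The period quoted for oscillation through the center of the Earth can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes, as the example does, a uniformly dense Earth; the numerical values of G, M and R are approximate values supplied here for illustration, and the function name `field` is hypothetical.

```python
import math

G = 6.674e-11   # gravitational constant in N m^2 / kg^2
M = 5.972e24    # kg, approximate mass of the Earth (assumed value)
R = 6.371e6     # m, approximate mean radius of the Earth (assumed value)


def field(r):
    """Radial component of g for a uniform sphere; negative means inward."""
    if r < R:
        return -G * M * r / R**3   # linear in r inside (Hooke's law form)
    return -G * M / r**2           # inverse square outside


# Inside the sphere the acceleration is -(G M / R^3) r, so omega0^2 = G M / R^3,
# which equals g / R with g = G M / R^2 the surface value.
omega0 = math.sqrt(G * M / R**3)
period_hours = 2.0 * math.pi / omega0 / 3600.0

print(f"surface gravity: {-field(R):.2f} m/s^2")
print(f"period: {period_hours:.2f} hours")
```

The computed period is consistent with the value of about 1.4 hours stated above.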
2.15 Summary


Newton’s Laws of Motion: 
A cursory review of Newtonian mechanics has been presented. The concept of inertial frames of reference 
was introduced since Newton’s laws of motion apply only to inertial frames of reference. 
Newton's Law of motion

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.6}$$

leads to second-order equations of motion which can be difficult to handle for many-body systems.
Solution of Newton's second-order equations of motion can be simplified using the three first-order
integrals coupled with corresponding conservation laws. The first-order time integral for linear momentum
is

$$\int_1^2 \mathbf{F}_i\,dt = \int_1^2 \frac{d\mathbf{p}_i}{dt}dt = (\mathbf{p}_2 - \mathbf{p}_1)_i \tag{2.10}$$

The first-order time integral for angular momentum is

$$\frac{d\mathbf{L}_i}{dt} = \mathbf{r}_i\times\frac{d\mathbf{p}_i}{dt} = \mathbf{N}_i \qquad \int_1^2 \mathbf{N}_i\,dt = \int_1^2 \frac{d\mathbf{L}_i}{dt}dt = (\mathbf{L}_2 - \mathbf{L}_1)_i \tag{2.16}$$

The first-order spatial integral is related to kinetic energy and the concept of work. That is

$$\int_1^2 \mathbf{F}_i\cdot d\mathbf{r}_i = (T_2 - T_1)_i$$
The conditions that lead to conservation of linear and angular momentum and total mechanical energy
were discussed for many-body systems. The important class of conservative forces was shown to apply if
the position-dependent forces do not depend on time or velocity, and if the work done by a force $\int_1^2 \mathbf{F}_i\cdot d\mathbf{r}_i$
is independent of the path taken between the initial and final locations. The total mechanical energy is a
constant of motion when the forces are conservative.

It was shown that the concept of center of mass of a many-body system or finite-sized body separates naturally
for all three first-order integrals. The center of mass is that point about which

$$\sum_i^n m_i\mathbf{r}'_i = \int \mathbf{r}'\rho\,dV = 0 \qquad \text{(Centre of mass definition)}$$

where $\mathbf{r}'_i$ is the vector defining the location of mass $m_i$ with respect to the center of mass. The concept of
center of mass greatly simplifies the description of the motion of finite-sized bodies and many-body systems
by separating out the important internal interactions and corresponding underlying physics, from the trivial
overall translational motion of a many-body system.

The Virial theorem states that the time-averaged properties are related by 


$$\langle T\rangle = -\frac{1}{2}\left\langle\sum_i \mathbf{F}_i\cdot\mathbf{r}_i\right\rangle \tag{2.86}$$


It was shown that the Virial theorem is useful for relating the time-averaged kinetic and potential energies, 
especially for cases involving either linear or inverse-square forces. 

Typical examples were presented of application of Newton's equations of motion to solving systems 
involving constant, linear, position-dependent, velocity-dependent, and time-dependent forces, to constrained 
and unconstrained systems, as well as systems with variable mass. Rigid-body rotation about a body-fixed 
rotation axis also was discussed. 

It is important to be cognizant of the following limitations that apply to Newton's laws of motion: 

1) Newtonian mechanics assumes that all observables are measured to unlimited precision, that is, t, E,
p, r are known exactly. Quantum physics introduces limits to measurement due to wave-particle duality.

2) The Newtonian view is that time and position are absolute concepts. The Theory of Relativity shows
that this is not true. Fortunately for most problems $v \ll c$ and thus Newtonian mechanics is an excellent
approximation.

3) Another limitation, to be discussed later, is that it is impractical to solve the equations of motion for
many interacting bodies such as all the molecules in a gas. Then it is necessary to resort to using statistical
averages; this approach is called statistical mechanics.

Newton's work constitutes a theory of motion in the universe that introduces the concept of causality.
Causality is that there is a one-to-one correspondence between cause and effect. Each force causes a known
effect that can be calculated. Thus the causal universe is pictured by philosophers to be a giant machine 
whose parts move like clockwork in a predictable and predetermined way according to the laws of nature. This 
is a deterministic view of nature. There are philosophical problems in that such a deterministic viewpoint 
appears to be contrary to free will. That is, taken to the extreme it implies that you were predestined to 
read this book because it is a natural consequence of this mechanical universe! 


Newton’s Laws of Gravitation 

Newton's Laws of Gravitation and the Laws of Electrostatics are essentially identical since they both
involve a central inverse square-law dependence of the forces. The important difference is that the gravitational
force is attractive whereas the electrostatic force between identical charges is repulsive. That is,
the gravitational constant G is replaced by $-\frac{1}{4\pi\varepsilon_0}$ and the mass density $\rho$ becomes the charge density for
the case of electrostatics. As a consequence it is unnecessary to make a detailed study of Newton's law of
gravitation since it is identical to what has already been studied in your accompanying electrostatics courses.
Table 2.1 summarizes and compares the laws of gravitation and electrostatics. For both gravitation and
electrostatics the field is central and conservative and depends on distance as $1/r^2$.

The laws of gravitation and electrostatics can be expressed in a more useful form in terms of the flux and
circulation of the field as given either in the vector integral or vector differential forms. The
radial independence of the flux, and corresponding divergence, is a statement that the fields are radial and
have a $1/r^2$ dependence. The statement that the circulation, and corresponding curl, are zero is a statement
that the fields are radial and conservative.


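Table 2.1: Comparison of Newton's law of gravitation and electrostatics.

| | Gravitation | Electrostatics |
|---|---|---|
| Force field | $\mathbf{g} \equiv \frac{\mathbf{F}}{m}$ | $\mathbf{E} \equiv \frac{\mathbf{F}}{q}$ |
| Density | Mass density $\rho(\mathbf{r}')$ | Charge density $\rho(\mathbf{r}')$ |
| Conservative central field | $\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$ | $\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$ |
| Flux | $\Phi \equiv \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_{\text{enclosed}} \rho\,dv$ | $\Phi \equiv \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_{\text{enclosed}} \rho\,dv$ |
| Circulation | $\oint \mathbf{g}_{net}\cdot d\mathbf{l} = 0$ | $\oint \mathbf{E}_{net}\cdot d\mathbf{l} = 0$ |
| Divergence | $\nabla\cdot\mathbf{g} = -4\pi G\rho$ | $\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\rho$ |
| Curl | $\nabla\times\mathbf{g} = 0$ | $\nabla\times\mathbf{E} = 0$ |
| Potential | $\Delta\phi_{\infty\to p} = -G\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$ | $\Delta\phi_{\infty\to p} = \frac{1}{4\pi\varepsilon_0}\int_v \frac{\rho(p')\,dv'}{r_{p'p}}$ |
| Poisson's equation | $\nabla^2\phi = 4\pi G\rho$ | $\nabla^2\phi = -\frac{1}{\varepsilon_0}\rho$ |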


Both the gravitational and electrostatic central fields are conservative making it possible to use the 
concept of the scalar potential field ¢. This concept is especially useful for solving some problems since the 
potential can be evaluated using a scalar integral. An alternate approach is to solve Poisson’s equation if the 
boundary values and mass distributions are known. The methods of solution of Newton’s law of gravitation 
are identical to those used in electrostatics and are readily accessible in the literature. 
Workshop exercises


1. Spend a few minutes looking over the following problems, paying particular attention to the problems that 
you think you might have trouble with. All of the problems are taken from an introductory physics course on 
mechanics, so this should seem like review material. After you have had some time to look over the problems, 
you will take turns stepping up to the board to solve one. When it is your turn, you may pick ANY of the 
problems that have not already been solved. Depending on the number of students in the recitation, you may 
be asked to solve more than one problem. Good luck! 


(a) Justin fires a 12-gram bullet into a block of wood. The bullet travels at 190 m/s, penetrates the 2.0-kg 
block of wood, and emerges going 150 m/s. If the block is stationary on a frictionless surface when hit, 
how fast does it move after the bullet emerges? 


(b) A mass m at the end of a spring vibrates with a frequency of 0.88 Hz; when an additional 1.25 kg mass 
is added to m, the frequency is 0.48 Hz. What is the value of m? 


(c) Dan has a new chandelier in his living room. The chandelier is 27-kg and it hangs from the ceiling on a 
vertical 4.0-m-long wire. What horizontal force would Dan need to use to displace its position 0.10 m to 
one side? What will be the tension in the wire? 


(d) Dianne has a new spring with a spring constant of 900 N/m that she bought at Springs-R-Us. She places 
it vertically on a table and compresses it by 0.150 m. What upward speed can it give to a 0.300-kg ball 
when released? 


(e) A tiger leaps horizontally from a 6.5-m-high rock with a speed of 4.0 m/s. How far from the base of the 
rock will she land? 


(f) How much work must SuperRyan do to stop a 1300-kg car traveling at 100 km/hr?


(g) Jason catches a baseball 3.1 s after throwing it vertically upward. With what speed did he throw it and 
what height did it reach? 


(h) Laura is practicing her figure skating and during her finale she can increase her rotation rate from an
initial rate of 1.0 rev every 2.0 s to a final rate of 3.0 rev/s. If her initial moment of inertia was 4.6 kg·m²,
what is her final moment of inertia?


(i) On an icy day in Rochester (imagine that!), you worry about parking your car in your driveway, which 
has an incline of 12°. Your neighbor Emily’s driveway has an incline of 9°, and Brian’s driveway across 
the street has one of 6°. The coefficient of static friction between tire rubber and ice is 0.15. Which 
driveway(s) will be safe to park a car? 


2. Two particles are projected from the same point with velocities $v_1$ and $v_2$, at elevations $\alpha_1$ and $\alpha_2$, respectively
($\alpha_1 > \alpha_2$). Show that if they are to collide in mid-air the interval between the firings must be

$$\frac{2 v_1 v_2 \sin(\alpha_1 - \alpha_2)}{g\,(v_1\cos\alpha_1 + v_2\cos\alpha_2)}$$


(If you don't have time to solve this problem completely, then at least give an outline of how you would go 
about solving the problem.) 


3. Read each of the following statements and, without consulting anyone else, mark them true or false. If you are 
unsure of any of them, make a guess. Once everyone has answered each of the statements individually, break 
into small groups and compare your answers. Try to come to an agreement as a group. The Teaching Assistant 
will then make sure everyone has the correct answer. Good luck! 


(a) The conservation of linear momentum is a consequence of translational symmetry, or the homogeneity of 
space. 


(b) For an isolated system with no external forces acting on it, the angular momentum will remain constant 
in both magnitude and direction. 


(c) A reference frame is called an inertial frame if Newton's laws are valid in that frame. 


(d) Newtonian mechanics and the laws of electromagnetism are invariant under Galilean transformations. 


(e) The law of conservation of angular momentum is a consequence of rotational symmetry, or the isotropy
of space. 


(f) The center of mass of a system of particles moves like a single particle of mass M (total mass of the 
system) acted on by a single force F' that is equal to the sum of all the external forces acting on the 
system. 


(g) If Newton’s laws are valid in one reference frame, then they are also valid in any reference frame accelerated 
with respect to the first system. 


(h) The law of conservation of energy is a consequence of inversion symmetry, or the invertibility of space. 


4. The teeter toy comprises two identical weights which hang on drooping arms attached to a peg as shown.
The arrangement is unexpectedly stable and can be spun and rocked with little danger of toppling over.


(a) Find an expression for the potential energy of the teeter toy as a function of $\theta$ when the teeter toy is
cocked at an angle $\theta$ about the pivot point. For simplicity, consider only rocking motion in the vertical
plane.


(b) Determine the equilibrium value(s) of $\theta$.


(c) Determine whether the equilibrium is stable, unstable, or neutral for the value(s) of $\theta$ found in part (b).


(d) How could you determine the answers to parts (b) and (c) from a graph of the potential energy versus $\theta$?


(e) Expand the expression for the potential energy about $\theta = 0$ and determine the frequency of small
oscillations.


5. For each of the situations described below, determine which of the four functional forms of the force is most 
appropriate. Consider motion only along one dimension. 


e Constant force: F = constant 

e Time-dependent force: F = F(t) 

e Velocity-dependent force: F = F(v) 
e Distance-dependent force: F = F(x) 


Go around the room and take turns answering a question. When it is your turn, pick a functional form and 
explain why you chose the one you did. If you are unsure, make a guess or ask a question to get help from the 
rest of the workshop. There may be more than one answer depending on your interpretation of the situation, 
so be sure to explore all of the possibilities. 


(a) A mass resting on a frictionless table is attached to a spring, which in turn is attached to a wall. The 
mass is pulled to the side and executes simple harmonic motion in the horizontal direction. 


(b) A freely-falling body subject to a constant gravitational field with no air resistance. 


(c) An electron, initially at rest (treat it classically!), encounters an incoming electromagnetic wave of electric
field intensity E given by $E = E_0\sin(\omega t + \phi)$.


(d) A large mass is affected by the gravitational field of another mass a distance d away. 


(e) A freely-falling body subject to a constant gravitational field with air resistance.


(f) A charged point particle is affected by the presence of another charged point particle a distance d away.


6. A particle of mass m is constrained to move on the frictionless inner surface of a cone of half-angle $\alpha$, as shown in the figure.


(a) Find the restrictions on the initial conditions such that the particle moves in a circular orbit about the
vertical axis.


(b) Determine whether this kind of orbit is stable.


7. Consider a thin rod of length L and mass M. 


(a) Draw gravitational field lines and equipotential lines for the rod. What can you say about the equipotential 
surfaces of the rod? 


(b) Calculate the gravitational potential at a point P that is a distance r from one end of the rod and in a 
direction perpendicular to the rod. 


(c) Calculate the gravitational field at P by direct integration. 
(d) Could you have used Gauss's law to find the gravitational field at P? Why or why not? 


8. Consider a single particle of mass M.


(a) Determine the position $\mathbf{r}$ and velocity $\mathbf{v}$ of a particle in spherical coordinates.


(b) Determine the total mechanical energy of the particle in potential V.


(c) Assume the force is conservative. Show that $\mathbf{F} = -\nabla V$. Show that it agrees with Stokes' theorem.


(d) Show that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ of the particle is conserved. Hint: $\frac{d}{dt}(\mathbf{A}\times\mathbf{B}) = \mathbf{A}\times\frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt}\times\mathbf{B}$.


9. Consider a fluid with density $\rho$ and velocity $\mathbf{v}$ in some volume V. The mass current $\mathbf{J} = \rho\mathbf{v}$ determines the
amount of mass exiting the surface per unit time by the integral $\oint_S \mathbf{J}\cdot d\mathbf{A}$.


(a) Using the divergence theorem, prove the continuity equation, $\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0$


10. A rocket of initial mass M burns fuel at constant rate k (kilograms per second), producing a constant force f. 
The total mass of available fuel is mo. Assume the rocket starts from rest and moves in a fixed direction with 
no external forces acting on it. 


(a) Determine the equation of motion of the rocket. 
(b) Determine the final velocity of the rocket. 


(c) Determine the displacement of the rocket in time. 
Problems


1. Consider a solid hemisphere of radius a. Compute the coordinates of the center of mass relative to the center 
of the spherical surface used to define the hemisphere. 


2. A 2000 kg Ford Excursion was travelling south on Mt. Hope Avenue when it collided with your 1000 kg sports car travelling
west on Elmwood Avenue. The two badly-damaged cars became entangled in the collision and left a skid mark
that is 20 meters long in a direction 14° to the west of the original direction of travel of the Excursion. The
wealthy Excursion driver hires a high-powered lawyer who accuses you of speeding through the intersection.
Use your P235 knowledge, plus the police officer's report of the recoil direction, the skid length, and knowledge
that the coefficient of sliding friction between the tires and road is $\mu = 0.6$, to deduce the original velocities of
both cars. Was either of the cars exceeding the 30 mph speed limit?


3. A particle of mass m moving in one dimension has potential energy $U(x) = U_0\left[2\left(\frac{x}{a}\right)^2 - \left(\frac{x}{a}\right)^4\right]$, where $U_0$ and a
are positive constants.


a) Find the force F(x) that acts on the particle. 

b) Sketch U(x). Find the positions of stable and unstable equilibrium. 

c) What is the angular frequency w of oscillations about the point of stable equilibrium? 
d) What is the minimum speed the particle must have at the origin to escape to infinity? 


e) At t = 0 the particle is at the origin and its velocity is positive and equal to the escape velocity. Find x(t) 
and sketch the result. 


4. a) Consider a single-stage rocket travelling in a straight line subject to an external force $F^{ext}$ acting along the
same line where $v_{ex}$ is the exhaust velocity of the ejected fuel relative to the rocket. Show that the equation of
motion is

$$m\dot{v} = -\dot{m}v_{ex} + F^{ext}$$

b) Specialize to the case of a rocket taking off vertically from rest in a uniform gravitational field g. Assume
that the rocket ejects mass at a constant rate of $\dot{m} = -k$ where k is a positive constant. Solve the equation of
motion to derive the dependence of velocity on time.


c) The first couple of minutes of the launch of the Space Shuttle can be described roughly by: initial mass
$= 2\times10^6$ kg, mass after 2 minutes $= 1\times10^6$ kg, exhaust speed $v_{ex} = 3000$ m/s, and initial velocity is zero.
Estimate the velocity of the Space Shuttle after two minutes of flight.


d) Describe what would happen to a rocket where $\dot{m}v_{ex} < mg$.


5. A time independent field F is conservative if V x F = 0. Use this fact to test if the following fields are 
conservative, and derive the corresponding potential U. 


a) $F_x = ayz + bx + c$, $F_y = axz + bz$, $F_z = axy + by$

b) $F_x = -ze^{-x}$, $F_y = \ln z$, $F_z = e^{-x} + \frac{y}{z}$


6. Consider a solid cylinder of mass m and radius r sliding without rolling down the smooth inclined face of a
wedge of mass M that is free to slide without friction on a horizontal plane floor. Use the coordinates shown
in the figure.

a) How far has the wedge moved by the time the cylinder has descended from rest a vertical distance h?

b) Now suppose that the cylinder is free to roll down the wedge without slipping. How far does the wedge
move in this case if the cylinder rolls down a vertical distance h?

c) In which case does the cylinder reach the bottom faster? How does this depend on the radius of the cylinder?


7. If the gravitational field vector is independent of the radial distance within a sphere, find the function describing
the mass density $\rho(r)$ of the sphere.


Chapter 3 


Linear oscillators 


3.1 Introduction 


Oscillations are a ubiquitous feature in nature. Examples are periodic motion of planets, the rise and fall 
of the tides, water waves, pendulum in a clock, musical instruments, sound waves, electromagnetic waves, 
and wave-particle duality in quantal physics. Oscillatory systems all have the same basic mathematical form 
although the names of the variables and parameters are different. The classical linear theory of oscillations 
will be assumed in this chapter since: (1) The linear approximation is well obeyed when the amplitudes of 
oscillation are small, that is, the restoring force obeys Hooke's Law. (2) The Principle of Superposition 
applies. (3) The linear theory allows most problems to be solved explicitly in closed form. This is in contrast 
to non-linear systems where the motion can be complicated and even chaotic as discussed in chapter 4.


3.2 Linear restoring forces 


An oscillatory system requires that there be a stable equilibrium about which the oscillations occur. Consider a conservative system with potential energy U for which the force is given by

$$\mathbf{F} = -\nabla U \tag{3.1}$$

Figure 3.1 illustrates a conservative system that has three locations at which the restoring force is zero, that is, where the gradient of the potential is zero. Stable oscillations occur only around locations 1 and 3, whereas the system is unstable at the zero-gradient location 2. Point 2 is called a separatrix in that an infinitesimal displacement of the particle from this separatrix will cause the particle to diverge towards either minimum 1 or 3, depending on which side of the separatrix the particle is displaced.

Figure 3.1: Stability for a one-dimensional potential U(x).

The requirements for stable oscillations about any point $x_0$ are that the potential energy must have the following properties.

Stability requirements

1) The potential has a stable position for which the restoring force is zero, i.e.

$$\left(\frac{dU}{dx}\right)_{x=x_0} = 0$$

2) The potential U must be positive and an even function of displacement $x - x_0$. That is,

$$\left(\frac{d^nU}{dx^n}\right)_{x_0} > 0$$

where n is even.
The requirement for the restoring force to be linear is that the restoring force for perturbation about a stable equilibrium at $x_0$ is of the form

$$F = -k(x - x_0) = m\ddot{x} \tag{3.2}$$

The potential energy function for a linear oscillator has a pure parabolic shape about the minimum location, that is,

$$U = \frac{1}{2}k(x - x_0)^2 \tag{3.3}$$

where $x_0$ is the location of the minimum.

Fortunately, oscillatory systems involve small amplitude oscillations about a stable minimum. For weakly non-linear systems, where the amplitude of oscillation $\Delta x$ about the minimum is small, it is useful to make a Taylor expansion of the potential energy about the minimum. That is

$$U(\Delta x) = U(x_0) + \Delta x\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{3.4}$$

By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$, and thus equation 3.4 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{3.5}$$

For small amplitude oscillations, the system is linear if the second-order $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ term in equation 3.5 is dominant.

The linearity for small amplitude oscillations greatly simplifies description of the oscillatory motion and 
complicated chaotic motion is avoided. Most physical systems are approximately linear for small amplitude 
oscillations, and thus the motion close to equilibrium approximates a linear harmonic oscillator. 
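
As a rough numerical illustration of how the second-order term of the Taylor expansion dominates at small amplitude, the following minimal Python sketch compares an exact potential with its parabolic approximation. The potential $U(x) = 1 - \cos x$ (arbitrary units, minimum at $x_0 = 0$) and all function names are assumptions chosen only for illustration; they are not taken from the text.

```python
import math

def exact_potential(x):
    """Hypothetical potential U(x) = 1 - cos(x), with its minimum at x0 = 0."""
    return 1.0 - math.cos(x)

def parabolic_term(dx):
    """Second-order Taylor term (dx**2 / 2!) * U''(x0); here U''(0) = 1."""
    return 0.5 * dx**2

def relative_error(dx):
    """Fractional error made by keeping only the second-order term."""
    exact = exact_potential(dx)
    return abs(parabolic_term(dx) - exact) / exact

# The error grows with the amplitude of the displacement.
for dx in (0.01, 0.1, 1.0):
    print(f"dx = {dx:4.2f}   relative error = {relative_error(dx):.1e}")
```

For this assumed potential the leading correction is the fourth-order term, so the fractional error scales roughly as $\Delta x^2/12$.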


3.3 Linearity and superposition 


An important aspect of linear systems is that the solutions obey the Principle of Superposition, that is, for 
the superposition of different oscillatory modes, the amplitudes add linearly. The linearly-damped linear 
oscillator is an example of a linear system in that it involves only linear operators, that is, it can be written 
in the operator form (appendix F.2) 


$$\left(\frac{d^2}{dt^2} + \Gamma\frac{d}{dt} + \omega_0^2\right)x(t) = A\cos\omega t \tag{3.6}$$

The quantity in the brackets on the left hand side is a linear operator that can be designated by $\mathbb{L}$ where

$$\mathbb{L}x(t) = F(t) \tag{3.7}$$


An important feature of linear operators is that they obey the principle of superposition. This property results from the fact that linear operators are distributive, that is

$$\mathbb{L}(x_1 + x_2) = \mathbb{L}(x_1) + \mathbb{L}(x_2) \tag{3.8}$$

Therefore if there are two solutions $x_1(t)$ and $x_2(t)$ for two different forcing functions $F_1(t)$ and $F_2(t)$

$$\mathbb{L}x_1(t) = F_1(t) \qquad \mathbb{L}x_2(t) = F_2(t) \tag{3.9}$$

then the addition of these two solutions, with arbitrary constants, also is a solution for linear operators.

$$\mathbb{L}(\alpha_1x_1 + \alpha_2x_2) = \alpha_1F_1(t) + \alpha_2F_2(t) \tag{3.10}$$


In general then

$$\mathbb{L}\left(\sum_{n=1}^{N}\alpha_nx_n(t)\right) = \left(\sum_{n=1}^{N}\alpha_nF_n(t)\right) \tag{3.11}$$

The left hand bracket can be identified as the linear combination of solutions

$$x(t) = \sum_{n=1}^{N}\alpha_nx_n(t) \tag{3.12}$$

while the driving force is a linear superposition of harmonic forces

$$F(t) = \sum_{n=1}^{N}\alpha_nF_n(t) \tag{3.13}$$

Thus these linear combinations also satisfy the general linear equation

$$\mathbb{L}x(t) = F(t) \tag{3.14}$$


Applicability of the Principle of Superposition to a system provides a tremendous advantage for handling 
and solving the equations of motion of oscillatory systems. 
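
The superposition property can be checked numerically. The sketch below is only an illustration under stated assumptions: the operator $\frac{d^2}{dt^2} + \Gamma\frac{d}{dt} + \omega_0^2$ is approximated by central finite differences, and the parameter values, the two trial functions, and the names `L` and `superposition_residual` are hypothetical choices, not part of the text.

```python
import math

GAMMA, OMEGA0, H = 0.3, 2.0, 1e-3   # assumed damping, natural frequency, step size

def L(x, t):
    """Apply d2/dt2 + GAMMA*d/dt + OMEGA0**2 to the function x at time t
    using central finite differences."""
    d1 = (x(t + H) - x(t - H)) / (2 * H)
    d2 = (x(t + H) - 2 * x(t) + x(t - H)) / H**2
    return d2 + GAMMA * d1 + OMEGA0**2 * x(t)

# Two arbitrary trial motions and two arbitrary constants.
x1 = lambda t: math.cos(1.5 * t)
x2 = lambda t: math.exp(-0.2 * t) * math.sin(3.0 * t)
a1, a2 = 2.0, -0.7
combo = lambda t: a1 * x1(t) + a2 * x2(t)

def superposition_residual(t):
    """L(a1*x1 + a2*x2) - (a1*L(x1) + a2*L(x2)); zero up to rounding error."""
    return L(combo, t) - (a1 * L(x1, t) + a2 * L(x2, t))

for t in (0.0, 0.5, 2.0):
    print(t, superposition_residual(t))
```

The residual vanishes to rounding error because each operation in `L` (differencing, scaling, adding) is itself linear.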


3.4 Geometrical representations of dynamical motion 


The powerful pattern-recognition capabilities of the human brain, coupled with geometrical representations 
of the motion of dynamical systems, provide a sensitive probe of periodic motion. The geometry of the 
motion often can provide more insight into the dynamics than inspection of mathematical functions. A 
system with n degrees of freedom is characterized by locations $q_i$, velocities $\dot{q}_i$, and momenta $p_i$, in addition
to the time t and instantaneous energy H(t). Geometrical representations of the dynamical correlations are
illustrated by the configuration space and phase space representations of these 2n + 2 variables.


3.4.1 Configuration space $(q_i, q_j, t)$


A configuration space plot shows the correlated motion of two spatial coordinates $q_i$ and $q_j$ averaged over time. An example is the two-dimensional linear oscillator with two equations of motion and solutions

$$m\ddot{x} + k_xx = 0 \qquad m\ddot{y} + k_yy = 0 \tag{3.15}$$

$$x(t) = A\cos(\omega_xt) \qquad y(t) = B\cos(\omega_yt - \delta) \tag{3.16}$$

where $\omega = \sqrt{\frac{k}{m}}$. For unequal restoring force constants, $k_x \neq k_y$, the trajectory executes complicated Lissajous figures that depend on the angular frequencies $\omega_x$, $\omega_y$, and the phase factor $\delta$. When the ratio of the angular frequencies along the two axes is rational, that is $\frac{\omega_x}{\omega_y}$ is a rational fraction, then the curve will repeat at regular intervals as shown in figure 3.2, and this shape depends on the phase difference. Otherwise the trajectory uniformly traverses the whole rectangle.
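
The repetition of the curve for a rational frequency ratio can be made concrete with a short sketch. It assumes the frequencies can be represented as a ratio of small integers; the helper names `lissajous` and `repeat_period` are hypothetical, and the default values $\omega_x = 4$, $\omega_y = 5$ follow the example plotted in figure 3.2.

```python
import math
from fractions import Fraction

def lissajous(t, wx=4.0, wy=5.0, delta=0.0, A=1.0, B=1.0):
    """Point (x, y) of equation 3.16 at time t."""
    return A * math.cos(wx * t), B * math.cos(wy * t - delta)

def repeat_period(wx, wy):
    """Shortest time after which both coordinates repeat, assuming wx/wy = p/q
    is rational. Then wx = p*w and wy = q*w, and the period is 2*pi/w."""
    ratio = Fraction(wx).limit_denominator(1000) / Fraction(wy).limit_denominator(1000)
    return 2.0 * math.pi * ratio.numerator / wx

T = repeat_period(4.0, 5.0)
print("repeat period:", T)
print(lissajous(0.3, delta=0.6), lissajous(0.3 + T, delta=0.6))
```

For an irrational ratio no such finite period exists, which is the case where the trajectory eventually fills the rectangle.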


Figure 3.2: Configuration plots of $(x, y)$ where $x = \cos(4t)$ and $y = \cos(5t - \delta)$ at four different phase values $\delta$. The curves are called Lissajous figures.


3.4.2 State space, $(q_i, \dot{q}_i, t)$


Visualization of a trajectory is enhanced by correlation of configuration $q_i$ and its corresponding velocity
$\dot{q}_i$ which specifies the direction of the motion. The state space representation¹ is especially valuable when
discussing Lagrangian mechanics which is based on the Lagrangian $L(q, \dot{q}, t)$.

The free undamped harmonic oscillator provides a simple illustration of state space. Consider a mass m 
attached to a spring with linear spring constant k for which the equation of motion is 


$$-kx = m\ddot{x} = m\dot{x}\frac{d\dot{x}}{dx} \tag{3.17}$$

By integration this gives

$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{3.18}$$

The first term in equation 3.18 is the kinetic energy, the second term is the potential energy, and E is the total energy which is conserved for this system. This equation can be expressed in terms of the state space coordinates as

$$\frac{\dot{x}^2}{\left(\frac{2E}{m}\right)} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.19}$$

This corresponds to the equation of an ellipse for a state-space plot of $\dot{x}$ versus $x$ as shown in figure 3.3 (upper). The elliptical paths shown correspond to contours of constant total energy which is partitioned between kinetic and potential energy. For the coordinate axis shown, the motion of a representative point will be in a clockwise direction as the total oscillator energy is redistributed between potential and kinetic energy. The area of the ellipse is proportional to the total energy E.
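
A minimal sketch of these statements, with assumed values m = 1, k = 4, E = 2 and hypothetical function names, checks that a point on the undamped trajectory satisfies the ellipse equation 3.19 and that the enclosed area scales linearly with E.

```python
import math

def state_space_point(t, m=1.0, k=4.0, E=2.0):
    """(x, xdot) of the undamped oscillator with total energy E,
    started at maximum displacement."""
    w0 = math.sqrt(k / m)
    amp = math.sqrt(2 * E / k)
    return amp * math.cos(w0 * t), -amp * w0 * math.sin(w0 * t)

def ellipse_lhs(x, xdot, m=1.0, k=4.0, E=2.0):
    """Left-hand side of equation 3.19; equals 1 on the constant-energy contour."""
    return xdot**2 / (2 * E / m) + x**2 / (2 * E / k)

def ellipse_area(m=1.0, k=4.0, E=2.0):
    """pi * (semi-axis along x) * (semi-axis along xdot) = 2*pi*E / sqrt(k*m)."""
    return math.pi * math.sqrt(2 * E / k) * math.sqrt(2 * E / m)

print(ellipse_lhs(*state_space_point(0.37)))
print(ellipse_area(E=4.0) / ellipse_area(E=2.0))
```

Starting from maximum displacement, the velocity first becomes negative, which corresponds to the clockwise sense of circulation when $\dot{x}$ is plotted against $x$.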


3.4.3 Phase space, $(q_i, p_i, t)$


Phase space, which was introduced by J.W. Gibbs for the field of statistical mechanics, provides a fundamental graphical representation in classical mechanics. The phase space coordinates $q_i, p_i$ are the conjugate coordinates $(q, p)$ and are fundamental to Hamiltonian mechanics which is based on the Hamiltonian $H(q, p, t)$. For a conservative system, only one phase-space curve passes through any point in phase space like the flow of an incompressible fluid. This makes phase space more useful than state space where many curves pass through any location. Lanczos [La49] defined an extended phase space using four-dimensional relativistic space-time as discussed in chapter 17.

Since $p_x = m\dot{x}$ for the non-relativistic, one-dimensional, linear oscillator, then equation 3.19 can be rewritten in the form

$$\frac{p_x^2}{2mE} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.20}$$

This is the equation of an ellipse in the phase space diagram shown in Fig. 3.3-lower which looks identical to Fig. 3.3-upper except that the ordinate variable is $p_x = m\dot{x}$. That is, the only difference is the phase-space coordinates $(x, p_x)$ replace the state-space coordinates $(x, \dot{x})$. State space plots are used extensively in this chapter to describe oscillatory motion. Although phase space is more fundamental, both state space and phase space plots provide useful representations for characterizing and elucidating a wide variety of motion in classical mechanics. The following discussion of the undamped simple pendulum illustrates the general features of state space.

Figure 3.3: State space (upper), and phase space (lower) diagrams, for the linear harmonic oscillator.


¹A universal name for the $(q, \dot{q})$ representation has not been adopted in the literature. Therefore this book has adopted
the name "state space" in common with reference [Ta05]. Lanczos [La49] uses the term "state space" to refer to the extended
phase space $(q, p, t)$ discussed in chapter 17.


3.4.4 Plane pendulum


Consider a simple plane pendulum of mass m attached to a string of length l in a uniform gravitational field g. There is only one generalized coordinate, $\theta$. Since the moment of inertia of the simple plane-pendulum is $I = ml^2$, then the kinetic energy is

$$T = \frac{1}{2}ml^2\dot{\theta}^2 \tag{3.21}$$

and the potential energy relative to the bottom dead center is

$$U = mgl(1 - \cos\theta) \tag{3.22}$$

Thus the total energy equals

$$E = \frac{1}{2}ml^2\dot{\theta}^2 + mgl(1 - \cos\theta) = \frac{p_\theta^2}{2ml^2} + mgl(1 - \cos\theta) \tag{3.23}$$

where E is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since the angular acceleration $\dot{p}_\theta$ explicitly depends on $\theta$.

It is interesting to look at the solutions for the equation of motion for a plane pendulum on a $(\theta, \dot{\theta})$ state space diagram shown in figure 3.4. The curves shown are equally-spaced contours of constant total energy. Note that the trajectories are ellipses only at very small angles where $1 - \cos\theta \approx \frac{\theta^2}{2}$; the contours are non-elliptical for higher amplitude oscillations. When the energy is in the range $0 < E < 2mgl$ the motion corresponds to oscillations of the pendulum about $\theta = 0$. The center of the ellipse is at (0,0) which is a stable equilibrium point for the oscillation. However, when $|E| > 2mgl$ there is a phase change to rotational motion about the horizontal axis, that is, the pendulum swings around and over top dead center, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at $E = 2mgl$ and is designated by the separatrix trajectory.

Figure 3.4 shows two cycles for $\theta$ to better illustrate the cyclic nature of the phase diagram. The closed loops, shown as fine solid lines, correspond to pendulum oscillations about $\theta = 0$ or $2\pi$ for $E < 2mgl$. The dashed lines show rolling motion for cases where the total energy $E > 2mgl$. The broad solid line is the separatrix that separates the rolling and oscillatory motion. Note that at the separatrix, the kinetic energy and $\dot{\theta}$ are zero when the pendulum is at top dead center which occurs when $\theta = \pm\pi$. The point $(\pi, 0)$ is an unstable equilibrium characterized by phase lines that are hyperbolic to this unstable equilibrium point. Note that $\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is, the phase diagram is better presented on a cylindrical phase space representation since $\theta$ is a cyclic variable that cycles around the cylinder whereas $\dot{\theta}$ oscillates equally about zero having both positive and negative values. If the state-space diagram is wrapped around a cylinder, then the unstable and stable equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $\dot{\theta} = 0$. For small oscillations about equilibrium, also called librations, the correlation between $\dot{\theta}$ and $\theta$ is given by the clockwise closed loops wrapped on the cylindrical surface, whereas for energies $|E| > 2mgl$ the positive $\dot{\theta}$ corresponds to counterclockwise rotations while the negative $\dot{\theta}$ corresponds to clockwise rotations.

Figure 3.4: State space diagram for a plane pendulum. The $\theta$ axis is in units of $\pi$ radians. Note that $\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is the phase diagram should be rolled into a cylinder connected at $\theta = \pm\pi$.
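
The division of the pendulum state space into libration and rotation can be sketched as a simple energy test against the separatrix value 2mgl. The function names, default parameters, and the tolerance used to recognize the separatrix are illustrative assumptions.

```python
import math

def pendulum_energy(theta, theta_dot, m=1.0, l=1.0, g=9.81):
    """Total energy of the plane pendulum: kinetic plus m*g*l*(1 - cos(theta))."""
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * (1.0 - math.cos(theta))

def classify(theta, theta_dot, m=1.0, l=1.0, g=9.81, tol=1e-12):
    """Label a state-space point by comparing its energy with the separatrix energy."""
    E = pendulum_energy(theta, theta_dot, m, l, g)
    E_sep = 2.0 * m * g * l
    if abs(E - E_sep) <= tol * E_sep:
        return "separatrix"
    return "libration" if E < E_sep else "rotation"

print(classify(0.1, 0.0))      # small swing about the bottom
print(classify(0.0, 10.0))     # fast enough to pass over the top
print(classify(math.pi, 0.0))  # balanced at top dead center
```

Because the classification depends only on E, every point on a given contour of figure 3.4 receives the same label.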

State-space diagrams will be used for describing oscillatory motion in chapters 3 and 4. Phase space is
used in statistical mechanics in order to handle the equations of motion for ensembles of $\sim 10^{23}$ independent
particles since momentum is more fundamental than velocity. Rather than try to account separately for
the motion of each particle for an ensemble, it is best to specify the region of phase space containing the
ensemble. If the number of particles is conserved, then every point in the initial phase space must transform
to corresponding points in the final phase space. This will be discussed in chapters 8.3 and 15.2.7.


3.5 Linearly-damped free linear oscillator


3.5.1 General solution 


All simple harmonic oscillations are damped to some degree due to energy dissipation via friction, viscous 
forces, or electrical resistance etc. The motion of damped systems is not conservative in that energy is 
dissipated as heat. As was discussed in chapter 2 the damping force can be expressed as 


$$\mathbf{F}_D(v) = -f(v)\hat{\mathbf{v}} \tag{3.24}$$


where the velocity dependent function f(v) can be complicated. Fortunately there is a very large class of 
problems in electricity and magnetism, classical mechanics, molecular, atomic, and nuclear physics, where 
the damping force depends linearly on velocity which greatly simplifies solution of the equations of motion. 
This chapter discusses the special case of linear damping. 

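Consider the free simple harmonic oscillator, that is, assuming no oscillatory forcing function, with a linear damping term $F_D(v) = -bv$ where the parameter b is the damping factor. Then the equation of motion is

$$-kx - b\dot{x} = m\ddot{x} \tag{3.25}$$

This can be rewritten as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = 0 \tag{3.26}$$

where the damping parameter

$$\Gamma = \frac{b}{m} \tag{3.27}$$

and the characteristic angular frequency

$$\omega_0 = \sqrt{\frac{k}{m}} \tag{3.28}$$

The general solution to the linearly-damped free oscillator is obtained by inserting the complex trial solution $z = z_0e^{i\omega t}$. Then

$$(i\omega)^2z_0e^{i\omega t} + i\omega\Gamma z_0e^{i\omega t} + \omega_0^2z_0e^{i\omega t} = 0 \tag{3.29}$$

This implies that

$$\omega^2 - i\Gamma\omega - \omega_0^2 = 0 \tag{3.30}$$

The solution is

$$\omega_\pm = i\frac{\Gamma}{2} \pm \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.31}$$

The two solutions $\omega_\pm$ are complex conjugates and thus the solutions of the damped free oscillator are

$$z = z_1e^{i\left(i\frac{\Gamma}{2} + \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} + z_2e^{i\left(i\frac{\Gamma}{2} - \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} \tag{3.32}$$

This can be written as

$$z = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1e^{i\omega_1t} + z_2e^{-i\omega_1t}\right] \tag{3.33}$$

where

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.34}$$


Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$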


When $\omega_1^2 > 0$, then the square root is real so the solution can be written taking the real part of z which gives that equation 3.33 equals

$$x(t) = Ae^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1t - \beta) \tag{3.35}$$

where A and $\beta$ are adjustable constants fit to the initial conditions. Therefore the velocity is given by

$$\dot{x}(t) = -Ae^{-\frac{\Gamma}{2}t}\left[\omega_1\sin(\omega_1t - \beta) + \frac{\Gamma}{2}\cos(\omega_1t - \beta)\right] \tag{3.36}$$

This is the damped sinusoidal oscillation illustrated in figure 3.5 (upper). The solution has the following characteristics:

a) The oscillation amplitude decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$.

b) There is a small reduction in the frequency of the oscillation due to the damping leading to $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.

Figure 3.5: The amplitude-time dependence and state-space diagrams for the free linearly-damped harmonic oscillator. The upper row shows the underdamped system. The lower row shows the overdamped ($\frac{\Gamma}{2} > \omega_0$) [solid line] and critically damped ($\frac{\Gamma}{2} = \omega_0$) [dashed line] in both cases assuming that initially the system is at rest.

Figure 3.6: Real and imaginary solutions $\omega_\pm$ of the damped harmonic oscillator. A phase transition occurs at $\Gamma = 2\omega_0$. For $\Gamma < 2\omega_0$ (dashed) the two solutions are complex conjugates and imaginary. For $\Gamma > 2\omega_0$ (solid), there are two real solutions $\omega_+$ and $\omega_-$ with widely different decay constants where $\omega_+$ dominates the decay at long times.


Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$

In this case the square root of $\omega_1^2$ is imaginary and can be expressed as $\omega_1 = i\sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$. Therefore the solution is obtained more naturally by using a real trial solution $z = z_0e^{-\omega t}$ in equation 3.26 which leads to two roots

$$\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\right]$$

Thus the exponentially damped decay has two decay rates $\omega_+$ and $\omega_-$.

$$x(t) = \left[A_1e^{-\omega_+t} + A_2e^{-\omega_-t}\right] \tag{3.37}$$

The time constant $\frac{1}{\omega_-} < \frac{1}{\omega_+}$, thus the second term $A_2e^{-\omega_-t}$ in the bracket decays in a shorter time than the first term $A_1e^{-\omega_+t}$. As illustrated in figure 3.6 the decay rate, which is imaginary when underdamped, i.e. $\frac{\Gamma}{2} < \omega_0$, bifurcates into two real values $\omega_\pm$ for overdamped, i.e. $\frac{\Gamma}{2} > \omega_0$. At large times the dominant term when overdamped is for $\omega_+$ which has the smallest decay rate, that is, the longest decay constant $\tau_+ = \frac{1}{\omega_+}$. There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero as shown in fig 3.5 (lower). The amplitude decays away with a time constant that is longer than $\frac{2}{\Gamma}$.


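Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$


This is the limiting case where $\frac{\Gamma}{2} = \omega_0$. For this case the solution is of the form

$$x(t) = (A + Bt)e^{-\left(\frac{\Gamma}{2}\right)t} \tag{3.38}$$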


This motion also is non-sinusoidal and evolves monotonically to zero. As shown in figure 3.5 the critically- 
damped solution goes to zero with the shortest time constant, that is, largest w. Thus analog electric meters 
are built almost critically damped so the needle moves to the new equilibrium value in the shortest time 
without oscillation. 
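
The three regimes can be collected into one illustrative function. The sketch below assumes the oscillator is released from rest at displacement `x0`, which fixes the adjustable constants in each solution; the function name and the labels `slow` and `fast` for the two overdamped decay rates $\frac{\Gamma}{2} \mp \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ are hypothetical.

```python
import math

def damped_free_motion(t, x0, gamma, omega0):
    """x(t) for the free linearly-damped oscillator released from rest at x0."""
    half = gamma / 2.0
    disc = omega0**2 - half**2                 # omega_1 squared
    if disc > 0:                               # underdamped: decaying sinusoid
        w1 = math.sqrt(disc)
        return x0 * math.exp(-half * t) * (math.cos(w1 * t) + (half / w1) * math.sin(w1 * t))
    if disc == 0:                              # critically damped: (A + B t) exp(-Gamma t / 2)
        return x0 * (1.0 + half * t) * math.exp(-half * t)
    s = math.sqrt(-disc)                       # overdamped: two real decay rates
    slow, fast = half - s, half + s
    a_slow = x0 * fast / (fast - slow)         # chosen so that x(0) = x0
    a_fast = -x0 * slow / (fast - slow)        # and xdot(0) = 0
    return a_slow * math.exp(-slow * t) + a_fast * math.exp(-fast * t)

for gamma in (0.4, 4.0, 6.0):                  # under-, critically, and over-damped for omega0 = 2
    print(gamma, [round(damped_free_motion(t, 1.0, gamma, 2.0), 4) for t in (0.0, 1.0, 3.0)])
```

With these assumed values the critically damped motion is closer to zero at t = 3 than the overdamped motion, in line with the remark about meter damping.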

It is useful to graphically represent the motion of the damped linear oscillator on either a state space
$(\dot{x}, x)$ diagram or phase space $(p_x, x)$ diagram as discussed in chapter 3.4. The state space plots for the
underdamped, overdamped, and critically-damped solutions of the damped harmonic oscillator are shown in
figure 3.5. For underdamped motion the state space diagram spirals inwards to the origin in contrast to
critical or overdamped motion where the state and phase space diagrams move monotonically to zero.


3.5.2 Energy dissipation


The instantaneous energy is the sum of the instantaneous kinetic and potential energies 


$$E = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 \tag{3.39}$$

where $x$ and $\dot{x}$ are given by the solution of the equation of motion.

Consider the total energy of the underdamped system

$$E = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}m\omega_0^2x^2 \tag{3.40}$$

where $k = m\omega_0^2$. The average total energy is given by substitution for $x$ and $\dot{x}$ and taking the average over one cycle. Since

$$x(t) = Ae^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1t - \beta) \tag{3.41}$$

then the velocity is given by

$$\dot{x}(t) = -Ae^{-\frac{\Gamma}{2}t}\left[\omega_1\sin(\omega_1t - \beta) + \frac{\Gamma}{2}\cos(\omega_1t - \beta)\right] \tag{3.42}$$

Inserting equations 3.41 and 3.42 into 3.40 gives a small amplitude oscillation about an exponential decay for the energy E. Averaging over one cycle and using the fact that $\langle\sin\theta\cos\theta\rangle = 0$, and $\langle\sin^2\theta\rangle = \langle\cos^2\theta\rangle = \frac{1}{2}$, gives the time-averaged total energy as

$$\langle E\rangle = e^{-\Gamma t}\left(\frac{1}{4}mA^2\omega_1^2 + \frac{1}{4}mA^2\left(\frac{\Gamma}{2}\right)^2 + \frac{1}{4}mA^2\omega_0^2\right) \tag{3.43}$$

which can be written as

$$\langle E\rangle = E_0e^{-\Gamma t} \tag{3.44}$$

Note that the energy of the linearly damped free oscillator decays away with a time constant $\tau = \frac{1}{\Gamma}$. That is, the intensity has a time constant that is half the time constant for the decay of the amplitude of the transient response. Note that the average kinetic and potential energies are identical, as implied by the Virial theorem, and both decay away with the same time constant. This relation between the mean life $\tau$ for decay of the damped harmonic oscillator and the damping width term $\Gamma$ occurs frequently in physics.
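
The exponential decay of the energy can be checked directly from the underdamped solution. The following sketch, with assumed parameter values and hypothetical function names, evaluates the instantaneous energy from $x(t)$ and $\dot{x}(t)$ and compares the energy one full period $T = 2\pi/\omega_1$ later with the factor $e^{-\Gamma T}$.

```python
import math

def energy(t, A=1.0, beta=0.3, m=1.0, gamma=0.2, omega0=3.0):
    """Instantaneous kinetic plus potential energy of the underdamped oscillator."""
    w1 = math.sqrt(omega0**2 - (gamma / 2)**2)
    env = A * math.exp(-gamma * t / 2)
    x = env * math.cos(w1 * t - beta)
    xdot = -env * (w1 * math.sin(w1 * t - beta) + (gamma / 2) * math.cos(w1 * t - beta))
    return 0.5 * m * xdot**2 + 0.5 * m * omega0**2 * x**2

def one_cycle_ratio(t, gamma=0.2, omega0=3.0):
    """Return (E(t + T) / E(t), exp(-gamma * T)) for one period T = 2*pi/omega_1."""
    w1 = math.sqrt(omega0**2 - (gamma / 2)**2)
    T = 2 * math.pi / w1
    ratio = energy(t + T, gamma=gamma, omega0=omega0) / energy(t, gamma=gamma, omega0=omega0)
    return ratio, math.exp(-gamma * T)

print(one_cycle_ratio(0.7))
```

The two numbers agree for any starting time t, because after one period both $x$ and $\dot{x}$ are scaled by the same factor $e^{-\Gamma T/2}$.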
The damping of an oscillator usually is characterized by a single parameter Q called the Quality Factor where

$$Q \equiv \frac{\text{Energy stored in the oscillator}}{\text{Energy dissipated per radian}} \tag{3.45}$$

The energy loss per radian is given by

$$\Delta E = \left|\frac{dE}{dt}\right|\frac{1}{\omega_1} = \frac{E\Gamma}{\omega_1} = \frac{E\Gamma}{\sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}} \tag{3.46}$$

where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$ is the frequency of the free damped linear oscillator. Thus the Quality factor Q equals

$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma} \tag{3.47}$$

The larger the Q factor, the less damped is the system, and the greater is the number of cycles of the oscillation in the damped wave train. Chapter 3.11.3 shows that the longer the wave train, that is the higher is the Q factor, the narrower is the frequency distribution around the central value. The Mössbauer effect in nuclear physics provides a remarkably long wave train that can be used to make high precision measurements. The high-Q precision of the LIGO laser interferometer was used in the first successful observation of gravitational waves in 2015.

Table 3.1: Typical Q factors in nature.
Earth, for earthquake wave
Piano string
Crystal in digital watch
Microwave cavity
Excited atom
Neutron star
LIGO laser
Mössbauer effect in nucleus


3.6 Sinusoidally-driven, linearly-damped, linear oscillator


The linearly-damped linear oscillator, driven by a harmonic driving force, is of considerable importance to 
all branches of science and engineering. The equation of motion can be written as 


$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = \frac{F(t)}{m} \tag{3.48}$$


where F(t) is the driving force. For mathematical simplicity the driving force is chosen to be a sinusoidal 
harmonic force. The solution of this second-order differential equation comprises two components, the 
complementary solution (transient response), and the particular solution (steady-state response). 


3.6.1 Transient response of a driven oscillator 


The transient response of a driven oscillator is given by the complementary solution of the above second-order 
differential equation 


$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = 0 \tag{3.49}$$

which is identical to the solution of the free linearly-damped harmonic oscillator. As discussed in section 3.5, the solution of the linearly-damped free oscillator is given by the real part of the complex variable z where

$$z = e^{-\frac{\Gamma}{2}t}\left[z_1e^{i\omega_1t} + z_2e^{-i\omega_1t}\right] \tag{3.50}$$

and

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.51}$$

Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$: When $\omega_1^2 > 0$, then the square root is real so the transient solution can be written taking the real part of z which gives

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1t) \tag{3.52}$$

The solution has the following characteristics:

a) The amplitude of the transient solution decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$ while the energy decreases with a time constant of $\frac{1}{\Gamma}$.

b) There is a small downward frequency shift in that $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.

Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$: In this case the square root is imaginary, $\omega_1 = i\omega_1'$, where $\omega_1' = \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ is real, and the solution is just an exponentially damped one

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\left[e^{\omega_1't} + e^{-\omega_1't}\right] \tag{3.53}$$

There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero. The total energy decays away with two time constants greater than $\frac{1}{\Gamma}$.

Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$: For this case, as mentioned for the damped free oscillator, the solution is of the form

$$x(t)_T = (A + Bt)e^{-\frac{\Gamma}{2}t} \tag{3.54}$$

The critically-damped system has the shortest time constant.


3.6.2 Steady state response of a driven oscillator


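The particular solution of the differential equation gives the important steady state response, $x(t)_S$, to the forcing function. Consider that the forcing term is a single frequency sinusoidal oscillation.

$$F(t) = F_0\cos(\omega t) \tag{3.55}$$

Thus the particular solution is the real part of the complex variable z which is a solution of

$$\ddot{z} + \Gamma\dot{z} + \omega_0^2z = \frac{F_0}{m}e^{i\omega t} \tag{3.56}$$

A trial solution is

$$z = z_0e^{i\omega t} \tag{3.57}$$

This leads to the relation

$$-\omega^2z_0 + i\omega\Gamma z_0 + \omega_0^2z_0 = \frac{F_0}{m} \tag{3.58}$$

Solving for $z_0$ and multiplying the numerator and denominator by the factor $(\omega_0^2 - \omega^2) - i\Gamma\omega$ gives

$$z_0 = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2) + i\Gamma\omega} = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\left[(\omega_0^2 - \omega^2) - i\Gamma\omega\right] \tag{3.59}$$

The steady state solution $x(t)_S$ thus is given by the real part of z, that is

$$x(t)_S = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\left[(\omega_0^2 - \omega^2)\cos\omega t + \Gamma\omega\sin\omega t\right] \tag{3.60}$$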


This can be expressed in terms of a phase $\delta$ defined as

$$\tan\delta \equiv \frac{\Gamma\omega}{\omega_0^2 - \omega^2} \tag{3.61}$$

As shown in figure 3.7, the hypotenuse of the triangle equals $\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}$. Thus

$$\cos\delta = \frac{\omega_0^2 - \omega^2}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \tag{3.62}$$

and

$$\sin\delta = \frac{\Gamma\omega}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}} \tag{3.63}$$

Figure 3.7: Phase between driving force and resultant motion.

The phase $\delta$ represents the phase difference between the driving force and the resultant motion. For a fixed $\omega_0$ the phase $\delta = 0$ when $\omega = 0$, and increases to $\delta = \frac{\pi}{2}$ when $\omega = \omega_0$. For $\omega > \omega_0$ the phase $\delta \to \pi$ as $\omega \to \infty$.

The steady state solution can be re-expressed in terms of the phase shift $\delta$ as

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\left[\cos\delta\cos\omega t + \sin\delta\sin\omega t\right] = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \tag{3.64}$$

Figure 3.8: Amplitude versus time, and state space plots of the transient solution (dashed) and total solution (solid) for two cases. The upper row shows one driving frequency $\omega$, while the lower row shows the same for the case where the driving frequency $\omega = 5\omega_1$.
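
The steady-state amplitude and phase follow directly from the complex amplitude $z_0$. The sketch below is a minimal illustration using Python's complex arithmetic; the function name and the default parameter values are assumptions. Since $z_0 = |z_0|e^{-i\delta}$, the real part of $z_0e^{i\omega t}$ is $|z_0|\cos(\omega t - \delta)$.

```python
import cmath
import math

def steady_state(omega, F0=1.0, m=1.0, gamma=0.5, omega0=2.0):
    """Return (amplitude, delta) such that x_S(t) = amplitude * cos(omega*t - delta)."""
    z0 = (F0 / m) / complex(omega0**2 - omega**2, gamma * omega)
    return abs(z0), -cmath.phase(z0)

for w in (0.0, 1.0, 2.0, 1000.0):
    amp, delta = steady_state(w)
    print(f"omega = {w:7.1f}  amplitude = {amp:.6f}  delta = {delta:.6f}")
```

The printed phases run from 0 at zero driving frequency, through $\frac{\pi}{2}$ at $\omega = \omega_0$, towards $\pi$ at very high driving frequency.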


3.6.3 Complete solution of the driven oscillator 


To summarize, the total solution of the sinusoidally forced linearly-damped harmonic oscillator is the sum 
of the transient and steady-state solutions of the equations of motion. 


$$x(t)_{Total} = x(t)_T + x(t)_S \tag{3.65}$$

For the underdamped case, the transient solution is the complementary solution

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \beta) \tag{3.66}$$

where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. The steady-state solution is given by the particular solution

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \tag{3.67}$$

Note that the frequency of the transient solution is $\omega_1$ which in general differs from the driving frequency $\omega$. The phase shift $\beta - \delta$ for the transient component is set by the initial conditions. The transient response leads to a more complicated motion immediately after the driving function is switched on. Figure 3.8 illustrates the amplitude time dependence and state space diagram for the transient component, and the total response, for two driving frequencies, one of which is $\omega = 5\omega_1$. Note that the modulation of the steady-state response by the transient response is unimportant once the transient response has damped out leading to a constant elliptical state space trajectory. For cases where the initial conditions are $x = \dot{x} = 0$ then the transient solution has a relative phase difference $\beta - \delta = \pi$ radians at $t = 0$ and relative amplitudes such that the transient and steady-state solutions cancel at $t = 0$.
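
The cancellation of the transient and steady-state parts at t = 0 can be illustrated numerically. In the sketch below the transient is written in the equivalent form $e^{-\frac{\Gamma}{2}t}[C\cos\omega_1t + D\sin\omega_1t]$, with C and D fixed by the assumed initial conditions $x = \dot{x} = 0$; the function name and parameter values are hypothetical.

```python
import cmath
import math

def total_solution(t, omega, F0=1.0, m=1.0, gamma=0.5, omega0=2.0):
    """Return (transient, steady) parts of the underdamped driven oscillator
    started at rest at the origin, x(0) = xdot(0) = 0."""
    z0 = (F0 / m) / complex(omega0**2 - omega**2, gamma * omega)
    a, delta = abs(z0), -cmath.phase(z0)          # steady-state amplitude and phase
    w1 = math.sqrt(omega0**2 - (gamma / 2)**2)    # transient angular frequency
    C = -a * math.cos(delta)                      # enforces x(0) = 0
    D = ((gamma / 2) * C - a * omega * math.sin(delta)) / w1   # enforces xdot(0) = 0
    transient = math.exp(-gamma * t / 2) * (C * math.cos(w1 * t) + D * math.sin(w1 * t))
    steady = a * math.cos(omega * t - delta)
    return transient, steady

for t in (0.0, 5.0, 60.0):
    tr, st = total_solution(t, omega=1.3)
    print(f"t = {t:5.1f}  transient = {tr:+.6f}  steady = {st:+.6f}  total = {tr + st:+.6f}")
```

At late times the transient part is negligible and the total response is the steady-state oscillation at the driving frequency.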

The characteristic sounds of different types of musical instruments depend very much on the admixture
of transient solutions plus the number and mixture of oscillatory active modes. Percussive instruments, such
as the piano, have a large transient component. The mixture of transient and steady-state solutions for
forced oscillations occurs frequently in studies of RLC networks in electrical circuit analysis.


3.6.4 Resonance


The discussion so far has covered the role of the transient and steady-state solutions of the driven damped harmonic oscillator, which occurs frequently in science and engineering. Another important aspect is resonance that occurs when the driving frequency $\omega$ approaches the natural frequency $\omega_1$ of the damped system. Consider the case where the time is sufficient for the transient solution to have decayed to zero.

Figure 3.9 shows the amplitude and phase for the steady-state response as $\omega$ goes through a resonance as the driving frequency is changed. The steady-state solution of the driven oscillator follows the driving force when $\omega << \omega_0$ in that the phase difference is zero and the amplitude is just $\frac{F_0}{m\omega_0^2}$. The response of the system peaks at resonance, while for $\omega >> \omega_0$ the harmonic system is unable to follow the more rapidly oscillating driving force and thus the phase of the induced oscillation is out of phase with the driving force and the amplitude of the oscillation tends to zero.
Note that the resonance frequency for a driven damped oscillator differs from that for the undriven damped oscillator, and differs from that for the undamped oscillator. The natural frequency for an undamped harmonic oscillator is given by

$$\omega_0^2 = \frac{k}{m} \tag{3.68}$$

The transient solution is the same as damped free oscillations of a damped oscillator and has a frequency of the system $\omega_1$ given by

$$\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 \tag{3.69}$$


That is, damping slightly reduces the frequency. 

For the driven oscillator the maximum value of the steady-state amplitude response is obtained by taking the maximum of the function $x(t)_S$, that is when $\frac{dx_S}{d\omega} = 0$. This occurs at the resonance angular frequency $\omega_R$ where

$$\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 \tag{3.70}$$

Figure 3.9: Resonance behavior for the linearly-damped, harmonically driven, linear oscillator.

No resonance occurs if $\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 < 0$ since then $\omega_R$ is imaginary and the amplitude decreases monotonically with increasing $\omega$. Note that the above three frequencies are identical if $\Gamma = 0$ but they differ when $\Gamma > 0$ and $\omega_R < \omega_1 < \omega_0$.

For the driven oscillator it is customary to define the quality factor Q as

$$Q \equiv \frac{\omega_R}{\Gamma} \tag{3.71}$$

When $Q >> 1$ the system has a narrow high resonance peak. As the damping increases the quality factor decreases leading to a wider and lower peak. The resonance disappears when $Q < 1$.
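
The ordering of the three characteristic frequencies, and the location of the amplitude peak, can be checked with a brute-force sketch. It assumes an underdamped system ($\frac{\Gamma}{2} < \omega_0$); the function names, parameter values, and grid resolution are illustrative choices.

```python
import math

def characteristic_frequencies(gamma, omega0):
    """Return (omega0, omega1, omegaR); omegaR is None when no resonance peak exists.
    Assumes gamma/2 < omega0 so that omega1 is real."""
    w1 = math.sqrt(omega0**2 - (gamma / 2)**2)
    wr_sq = omega0**2 - 2 * (gamma / 2)**2
    wr = math.sqrt(wr_sq) if wr_sq > 0 else None
    return omega0, w1, wr

def amplitude(omega, gamma, omega0, F0=1.0, m=1.0):
    """Steady-state amplitude of the driven oscillator."""
    return (F0 / m) / math.sqrt((omega0**2 - omega**2)**2 + (gamma * omega)**2)

def numerical_peak(gamma, omega0, n=40001):
    """Driving frequency that maximizes the amplitude on a grid over [0, 2*omega0]."""
    grid = (2 * omega0 * i / (n - 1) for i in range(n))
    return max(grid, key=lambda w: amplitude(w, gamma, omega0))

w0, w1, wr = characteristic_frequencies(0.5, 2.0)
print(w0, w1, wr, numerical_peak(0.5, 2.0))
print(characteristic_frequencies(1.6, 1.0))   # heavy damping: no resonance peak
```

The grid search is deliberately crude; it serves only to confirm that the amplitude maximum sits at $\omega_R$ rather than at $\omega_1$ or $\omega_0$.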


3.6.5 Energy absorption 


Discussion of energy stored in resonant systems is best described using the steady state solution which is 
dominant after the transient solution has decayed to zero. Then 


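$$x(t)_S = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\left[(\omega_0^2 - \omega^2)\cos\omega t + \Gamma\omega\sin\omega t\right] \tag{3.72}$$

This can be rewritten as

$$x(t)_S = A_{el}\cos\omega t + A_{abs}\sin\omega t \tag{3.73}$$

where the elastic amplitude

$$A_{el} = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}(\omega_0^2 - \omega^2) \tag{3.74}$$

while the absorptive amplitude

$$A_{abs} = \frac{\frac{F_0}{m}}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}\Gamma\omega \tag{3.75}$$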


Figure 3.10 shows the behavior of the absorptive and elastic amplitudes as a function of angular frequency $\omega$. The absorptive amplitude is significant only near resonance whereas the elastic amplitude goes to zero at resonance. Note that the full width at half maximum of the absorptive amplitude peak equals $\Gamma$.

Figure 3.10: Elastic (solid) and absorptive (dashed) amplitudes of the steady-state solution for $\Gamma = 0.10\omega_0$.

The work done by the force $F_0\cos\omega t$ on the oscillator is

$$W = \int F\,dx = \int F\dot{x}\,dt \tag{3.76}$$

Thus the absorbed power P(t) is given by

$$P(t) = \frac{dW}{dt} = F\dot{x} \tag{3.77}$$

The steady state response gives a velocity

$$\dot{x}(t)_S = -\omega A_{el}\sin\omega t + \omega A_{abs}\cos\omega t \tag{3.78}$$

Thus the steady-state instantaneous power input is

$$P(t) = F_0\cos\omega t\left[-\omega A_{el}\sin\omega t + \omega A_{abs}\cos\omega t\right] \tag{3.79}$$


The absorptive term steadily absorbs energy while the elastic term oscillates as energy is alternately absorbed 
or emitted. The time average over one cycle is given by 


$$\langle P\rangle = F_0\left[-\omega A_{el}\langle\cos\omega t\sin\omega t\rangle + \omega A_{abs}\langle\cos^2\omega t\rangle\right] \tag{3.80}$$

where $\langle\cos\omega t\sin\omega t\rangle$ and $\langle\cos^2\omega t\rangle$ are the time averages over one cycle. The time average over one complete cycle for the first term in the bracket is

$$-\omega A_{el}\langle\cos\omega t\sin\omega t\rangle = 0 \tag{3.81}$$

while for the second term

$$\langle\cos^2\omega t\rangle = \frac{1}{T}\int_{t_0}^{t_0+T}\cos^2\omega t\,dt = \frac{1}{2} \tag{3.82}$$

Thus the time average power input is determined by only the absorptive term

$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \tag{3.83}$$

This shape of the power curve is a classic Lorentzian shape. Note that the maximum of the average kinetic energy occurs at $\omega_{KE} = \omega_0$ which is different from the peak of the amplitude which occurs at $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The potential energy is proportional to the amplitude squared, i.e. $x_S^2$, which peaks at the same angular frequency as the amplitude, that is, $\omega_{PE}^2 = \omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The kinetic and potential energies resonate at different angular frequencies as a result of the fact that the driven damped oscillator is not conservative because energy is continually exchanged between the oscillator and the driving force system in addition to the energy dissipation due to the damping.
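
The statement that only the absorptive amplitude contributes to the mean power can be checked by averaging the instantaneous power numerically over one drive period. The sketch below uses a simple midpoint rule; the function names, parameter values, and number of sample points are assumptions.

```python
import math

def amplitudes(omega, F0=1.0, m=1.0, gamma=0.5, omega0=2.0):
    """Return (elastic, absorptive) amplitudes of the steady-state solution."""
    denom = (omega0**2 - omega**2)**2 + (gamma * omega)**2
    return (F0 / m) * (omega0**2 - omega**2) / denom, (F0 / m) * gamma * omega / denom

def mean_power_numeric(omega, F0=1.0, m=1.0, gamma=0.5, omega0=2.0, n=10000):
    """Average of F(t) * xdot_S(t) over one drive period, by the midpoint rule."""
    a_el, a_abs = amplitudes(omega, F0, m, gamma, omega0)
    period = 2 * math.pi / omega
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * period / n
        v = -omega * a_el * math.sin(omega * t) + omega * a_abs * math.cos(omega * t)
        total += F0 * math.cos(omega * t) * v
    return total / n

def mean_power_formula(omega, F0=1.0, m=1.0, gamma=0.5, omega0=2.0):
    """Closed form: one half of F0 * omega * (absorptive amplitude)."""
    return 0.5 * F0 * omega * amplitudes(omega, F0, m, gamma, omega0)[1]

for w in (0.5, 2.0, 3.0):
    print(w, mean_power_numeric(w), mean_power_formula(w))
```

The elastic term integrates to zero over a full cycle, so the two columns agree at every driving frequency.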
When w ~ we >>T, then the power equation simplifies since 


Ch — w?) = (wo + w) (wo — w) ~ 2wo (wo — w) (3.84) 


Therefore 

= Fè T 

7 8m (wo = w)? + (oe 
This is called the Lorentzian or Breit-Wigner shape. The half power points are at a frequency difference 
from resonance of +Aw where 


(P) (3.85) 


T 
Aw = |wo — w| = tz (3.86) 
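Thus the full width at half maximum of the Lorentzian curve equals $\Gamma$. Note that the Lorentzian has a narrower peak but much wider tail relative to a Gaussian shape. At the peak of the absorbed power, the absorptive amplitude can be written as

$$A_{abs}(\omega=\omega_0) = \frac{F_0}{m\,\Gamma\,\omega_0} \approx \frac{F_0}{m\,\omega_0^2}\,Q \qquad (3.87)$$

where $Q \approx \omega_0/\Gamma$ for weak damping.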


That is, the peak amplitude increases with increase in Q. This explains the classic comedy scene where the 
soprano shatters the crystal glass because the highest quality crystal glass has a high Q which leads to a 
large amplitude oscillation when she sings on resonance. 

The mean lifetime $\tau$ of the free linearly-damped harmonic oscillator, that is, the time for the energy of free oscillations to decay to $1/e$, was shown to be related to the damping coefficient $\Gamma$ by

$$\tau = \frac{1}{\Gamma} \qquad (3.88)$$

Therefore we have the classical uncertainty principle for the linearly-damped harmonic oscillator that the measured full-width at half maximum of the energy resonance curve for forced oscillation and the mean life for decay of the energy of a free linearly-damped oscillator are related by

$$\tau\,\Gamma = 1 \qquad (3.89)$$


This relation is correct only for a linearly-damped harmonic system. Comparable relations between the 
lifetime and damping width exist for different forms of damping. 
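The width-lifetime relation can be illustrated numerically. The following sketch is a hedged example, not part of the original derivation: it locates the half-power points of the exact power curve of equation (3.83) by bisection and multiplies the resulting full width by $\tau = 1/\Gamma$. The function names and the parameter values ($\omega_0 = 10$, $\Gamma = 1$, $F_0 = m = 1$) are assumptions chosen for illustration.

```python
def average_power(omega, omega0, gamma, F0=1.0, m=1.0):
    """Average absorbed power, equation (3.83)."""
    denom = (omega0**2 - omega**2)**2 + (gamma * omega)**2
    return (F0**2 / (2.0 * m)) * gamma * omega**2 / denom

def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi], assuming exactly one sign change in the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def power_fwhm(omega0, gamma):
    """Full width at half maximum of the power curve, whose peak is at omega0."""
    half = 0.5 * average_power(omega0, omega0, gamma)
    excess = lambda w: average_power(w, omega0, gamma) - half
    lower = bisect(excess, 1e-6 * omega0, omega0)
    upper = bisect(excess, omega0, 10.0 * omega0)
    return upper - lower

omega0, gamma = 10.0, 1.0      # illustrative values
tau = 1.0 / gamma              # mean lifetime, equation (3.88)
fwhm = power_fwhm(omega0, gamma)
print(fwhm, fwhm * tau)
```

For this curve the half-power condition reduces to $|\omega_0^2-\omega^2| = \Gamma\omega$, so the computed width equals $\Gamma$ and the product with $\tau$ is unity, in line with equation (3.89).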

One can demonstrate the above line width and decay time relationship using an acoustically driven electric guitar string. Similarly, the width of the emitted electromagnetic radiation is related to the lifetime for atomic or nuclear electromagnetic decay. This classical uncertainty principle is exactly the same as the one encountered in quantum physics due to wave-particle duality. In nuclear physics it is difficult to measure the lifetime of states when $\tau < 10^{-13}$ s. For shorter lifetimes the value of $\Gamma$ can be determined from the shape of the resonance curve which can be measured directly when the damping is large.


3.1 Example: Harmonically-driven series RLC circuit 


The harmonically-driven, resonant, series RLC circuit is encountered frequently in AC circuits. Kirchhoff’s Rules applied to the series RLC circuit lead to the differential equation

$$L\ddot{q} + R\dot{q} + \frac{q}{C} = V_0\sin\omega t$$

where $q$ is charge, $L$ is the inductance, $C$ is the capacitance, $R$ is the resistance, and the applied voltage across the circuit is $V(\omega) = V_0\sin\omega t$. The linearity of the network allows use of the phasor approach which assumes that the current $I = I_0 e^{i\omega t}$, the voltage $V = V_0 e^{i(\omega t+\delta)}$, and the impedance is a complex number $Z = \frac{V_0}{I_0}e^{i\delta}$ where $\delta$ is the phase difference between the voltage and the current. For this circuit the impedance is given by

$$Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$$

Because of the phases involved in this RLC circuit, the maximum voltage across the resistor occurs at a frequency of $\omega_R = \omega_0$, across the capacitor the maximum voltage occurs at a frequency $\omega_C^2 = \omega_0^2 - \frac{R^2}{2L^2}$, and across the inductor $L$ the maximum voltage occurs at a frequency $\omega_L^2 = \frac{\omega_0^2}{1-\frac{R^2C}{2L}}$, where $\omega_0 = \frac{1}{\sqrt{LC}}$ is the resonance angular frequency when $R = 0$. Thus these resonance frequencies differ when $R > 0$.
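The three resonance frequencies quoted above can be checked with a short numerical sketch. This is an illustration under stated assumptions: the component values ($R = 0.5$, $L = C = 1$ in arbitrary units), the function names, and the use of a ternary search to locate each maximum are choices made here, not part of the original example.

```python
import math

def voltage_amplitudes(omega, R, L, C, V0=1.0):
    """Voltage amplitudes across R, C and L for a series RLC circuit."""
    reactance = omega * L - 1.0 / (omega * C)
    current = V0 / math.hypot(R, reactance)      # I0 = V0 / |Z|
    return current * R, current / (omega * C), current * omega * L

def argmax_unimodal(func, lo, hi, iterations=200):
    """Ternary search for the maximum of a single-peaked function."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

R, L, C = 0.5, 1.0, 1.0                      # illustrative values
omega0 = 1.0 / math.sqrt(L * C)
omega_R = argmax_unimodal(lambda w: voltage_amplitudes(w, R, L, C)[0], 0.2, 5.0)
omega_C = argmax_unimodal(lambda w: voltage_amplitudes(w, R, L, C)[1], 0.2, 5.0)
omega_L = argmax_unimodal(lambda w: voltage_amplitudes(w, R, L, C)[2], 0.2, 5.0)

print(omega_R, omega0)
print(omega_C**2, omega0**2 - R**2 / (2.0 * L**2))
print(omega_L**2, omega0**2 / (1.0 - R**2 * C / (2.0 * L)))
```

Each printed pair compares the numerically located maximum with the corresponding closed-form expression; the capacitor peak lies below $\omega_0$ and the inductor peak above it.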


3.7 Wave equation 


Wave motion is a ubiquitous feature in nature. Mechanical wave motion is manifest by transverse waves 
on fluid surfaces, longitudinal and transverse seismic waves travelling through the Earth, and vibrations of 
mechanical structures such as suspended cables. Acoustical wave motion occurs on the stretched strings of 
the violin, as well as the cavities of wind instruments. Wave motion occurs for deformable bodies where 
elastic forces acting between the nearest-neighbor atoms of the body exert time-dependent forces on one 
another. Electromagnetic wave motion includes wavelengths ranging from $10^{5}$ m radio waves to $10^{-13}$ m $\gamma$-rays. Matter waves are a prominent feature of quantum physics. All these manifestations of waves exhibit the same general features of wave motion. Chapter 14 will introduce the collective modes of motion, called the normal modes, of coupled, many-body, linear oscillators which act as independent modes of motion. The basic elements of wave motion are introduced at this juncture because the equations of wave motion are simple, and wave motion features prominently in several chapters throughout this book.

Consider a travelling wave in one dimension for a linear system. If the wave is moving, then the wave function $\Psi(x,t)$ describing the shape of the wave is a function of both $x$ and $t$. The instantaneous amplitude of the wave $\Psi(x,t)$ could correspond to the transverse displacement of a wave on a string, the longitudinal amplitude of a wave on a spring, the pressure of a longitudinal sound wave, the transverse electric or magnetic fields in an electromagnetic wave, a matter wave, etc. If the wave train maintains its shape as it moves, then one can describe the wave train by the function $f(\phi)$ where the coordinate $\phi$ is measured relative to the shape of the wave, that is, it could correspond to the phase of a crest of the wave. Suppose that $f(\phi = 0)$ corresponds to a constant phase, e.g. the peak of the travelling pulse. Assuming that the wave travels at a phase velocity $v$ in the $x$ direction and the peak is at $x = 0$ for $t = 0$, then it is at $x = vt$ at time $t$. That is, a point with phase $\phi$ fixed with respect to the waveform shape of the wave profile $f(\phi)$ moves in the $+x$ direction for $\phi = x - vt$ and in the $-x$ direction for $\phi = x + vt$.

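General wave motion can be described by solutions of a wave equation. The wave equation can be written in terms of the spatial and temporal derivatives of the wave function $\Psi(x,t)$. Consider the first partial derivatives of $\Psi(x,t) = f(x \mp vt) = f(\phi)$.

$$\frac{\partial\Psi}{\partial x} = \frac{d\Psi}{d\phi}\frac{\partial\phi}{\partial x} = \frac{d\Psi}{d\phi} \qquad (3.90)$$

$$\frac{\partial\Psi}{\partial t} = \frac{d\Psi}{d\phi}\frac{\partial\phi}{\partial t} = \mp v\,\frac{d\Psi}{d\phi} \qquad (3.91)$$

Factoring out $\frac{d\Psi}{d\phi}$ for the first derivatives gives

$$\frac{\partial\Psi}{\partial t} = \mp v\,\frac{\partial\Psi}{\partial x} \qquad (3.92)$$

The sign in this equation depends on the sign of the wave velocity, making it not a generally useful formula. Consider the second derivatives

$$\frac{\partial^2\Psi}{\partial x^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial\phi}{\partial x}\right)^2 = \frac{d^2\Psi}{d\phi^2} \qquad (3.93)$$

and

$$\frac{\partial^2\Psi}{\partial t^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial\phi}{\partial t}\right)^2 = +v^2\,\frac{d^2\Psi}{d\phi^2} \qquad (3.94)$$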




Factoring out $\frac{d^2\Psi}{d\phi^2}$ gives

$$\frac{\partial^2\Psi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \qquad (3.95)$$

This wave equation in one dimension for a linear system is independent of the sign of the velocity. There 
are an infinite number of possible shapes of waves both travelling and standing in one dimension, all of these 
must satisfy this one-dimensional wave equation. The converse is that any function that satisfies this one 
dimensional wave equation must be a wave in this one dimension. 
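As a numerical illustration of equation (3.95), the sketch below evaluates both second derivatives by central finite differences for a pulse $f(x \mp vt)$ and confirms that the residual $\partial^2\Psi/\partial x^2 - v^{-2}\,\partial^2\Psi/\partial t^2$ vanishes to within the discretisation error for either direction of travel. The Gaussian pulse shape, the step size, and the function names are assumptions made purely for illustration.

```python
import math

def pulse(phi):
    """Assumed pulse shape f(phi); any smooth function would serve."""
    return math.exp(-phi**2)

def psi(x, t, v, direction=+1):
    """direction=+1 gives f(x - v t), moving toward +x; -1 gives f(x + v t)."""
    return pulse(x - direction * v * t)

def second_derivative(func, s, h):
    """Central finite-difference estimate of the second derivative at s."""
    return (func(s + h) - 2.0 * func(s) + func(s - h)) / h**2

def wave_equation_residual(x, t, v, direction, h=1e-3):
    """d2(psi)/dx2 - (1/v^2) d2(psi)/dt2, which should vanish."""
    d2_dx2 = second_derivative(lambda s: psi(s, t, v, direction), x, h)
    d2_dt2 = second_derivative(lambda s: psi(x, s, v, direction), t, h)
    return d2_dx2 - d2_dt2 / v**2

residual_right = wave_equation_residual(0.3, 0.2, 2.0, +1)
residual_left = wave_equation_residual(0.3, 0.2, 2.0, -1)
print(residual_right, residual_left)
```

Both residuals are of the order of the finite-difference truncation error, independent of the sign of the velocity.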

The Wave Equation in three dimensions is

$$\nabla^2\Psi \equiv \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \qquad (3.96)$$

There are an unlimited number of possible solutions $\Psi$ to this wave equation, any one of which corresponds to a wave motion with velocity $v$.

The Wave Equation is applicable to all manifestations of wave motion, both transverse and longitudinal, for linear systems. That is, it applies to waves on a string, water waves, seismic waves, sound waves, electromagnetic waves, matter waves, etc. If it can be shown that a wave equation can be derived for any system, discrete or continuous, then this is equivalent to proving the existence of waves of any waveform, frequency, or wavelength travelling with the phase velocity given by the wave equation. [Cra65]



3.8 Travelling and standing wave solutions of the wave equation 


The wave equation can exhibit both travelling and standing-wave solutions. Consider a one-dimensional travelling wave with velocity $v$ having a specific wavenumber $k = \frac{2\pi}{\lambda}$. Then the travelling wave is best written in terms of the phase of the wave as

$$\Psi(x,t) = A(k)e^{i(kx \mp \omega t)} = A(k)e^{ik(x \mp vt)} \qquad (3.97)$$

where the wave number $k = \frac{2\pi}{\lambda}$, with $\lambda$ being the wavelength, and angular frequency $\omega = kv$. This particular solution satisfies the wave equation and corresponds to a travelling wave with phase velocity $v = \frac{\omega}{k}$ in the positive or negative $x$ direction depending on whether the sign is negative or positive. Assuming that the superposition principle applies, then the superposition of these two particular solutions of the wave equation can be written as

$$\Psi(x,t) = A(k)\left(e^{i(kx-\omega t)} + e^{i(kx+\omega t)}\right) = A(k)e^{ikx}\left(e^{-i\omega t} + e^{i\omega t}\right) = 2A(k)e^{ikx}\cos\omega t \qquad (3.98)$$


Thus the superposition of two identical single wavelength travelling waves propagating in opposite directions 
can correspond to a standing wave solution. Note that a standing wave is identical to a stationary normal 
mode of the system discussed in chapter 14. This transformation between standing and travelling waves can 
be reversed, that is, the superposition of two standing waves, i.e. normal modes, can lead to a travelling 
wave solution of the wave equation. 
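The superposition in equation (3.98) can be verified directly with complex arithmetic. The following sketch is illustrative, with assumed values of $k$, $\omega$ and unit amplitude: it adds two oppositely directed travelling waves, compares the sum with $2A e^{ikx}\cos\omega t$, and checks that the real part has a node at $kx = \pi/2$ at all times, which is the signature of a standing wave.

```python
import cmath
import math

def travelling(x, t, k, omega, sign, A=1.0):
    """A exp[i(kx + sign*omega*t)]: sign=-1 moves toward +x, sign=+1 toward -x."""
    return A * cmath.exp(1j * (k * x + sign * omega * t))

def standing(x, t, k, omega, A=1.0):
    """Right-hand side of equation (3.98)."""
    return 2.0 * A * cmath.exp(1j * k * x) * math.cos(omega * t)

k, omega = 2.0, 3.0                      # illustrative values
points = [(0.1 * i, 0.07 * j) for i in range(20) for j in range(20)]
max_difference = max(
    abs(travelling(x, t, k, omega, -1) + travelling(x, t, k, omega, +1)
        - standing(x, t, k, omega))
    for x, t in points
)

x_node = math.pi / (2.0 * k)             # cos(k x) = 0 here
node_amplitude = max(
    abs((travelling(x_node, t, k, omega, -1)
         + travelling(x_node, t, k, omega, +1)).real)
    for _, t in points
)
print(max_difference, node_amplitude)
```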

Discussion of waveforms is simplified when using either of the following two limits.

1) The time dependence of the waveform at a given location $x = x_0$, which can be expressed using a Fourier decomposition, appendix I.2, of the time dependence as a function of angular frequency $\omega = n\omega_0$.

$$\Psi(x_0,t) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0x_0-\omega_0t)} = \sum_{n=-\infty}^{\infty} B_n(x_0)\,e^{-in\omega_0t} \qquad (3.99)$$

2) The spatial dependence of the waveform at a given instant $t = t_0$, which can be expressed using a Fourier decomposition of the spatial dependence as a function of wavenumber $k = nk_0$.

$$\Psi(x,t_0) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0x-\omega_0t_0)} = \sum_{n=-\infty}^{\infty} C_n(t_0)\,e^{ink_0x} \qquad (3.100)$$



The above is applicable both to discrete, or continuous linear oscillator systems, e.g. waves on a string. 

In summary, stationary normal modes of a system are obtained by a superposition of travelling waves 
travelling in opposite directions, or equivalently, travelling waves can result from a superposition of stationary 
normal modes. 


3.9 Waveform analysis


3.9.1 Harmonic decomposition 


As described in appendix J, when superposition applies, then a Fourier series decomposition of the form 3.101 can be made of any periodic function where

$$F(t) = \sum_{n=1}^{N}\alpha_n\cos(n\omega_0t + \phi_n) \qquad (3.101)$$

A more general Fourier Transform can be made for an aperiodic function where

$$F(t) = \int\alpha(\omega)\cos(\omega t + \phi(\omega))\,d\omega \qquad (3.102)$$


Any linear system that is subject to the forcing function $F(t)$ has an output that can be expressed as a linear superposition of the solutions of the individual harmonic components of the forcing function. Fourier analysis of periodic waveforms in terms of harmonic trigonometric functions plays a key role in describing oscillatory motion in classical mechanics and signal processing for linear systems. Fourier’s theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms. As a consequence two equivalent representations can be used to describe signals and waves: the first is in the time domain, which describes the time dependence of the signal; the second is in the frequency domain, which describes the frequency decomposition of the signal. Fourier analysis relates these equivalent representations.

Figure 3.11: The time and frequency representations of a system exhibiting beats.

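For example, the superposition of two equal intensity harmonic oscillators in the time domain is given by

$$y(t) = A\cos(\omega_1t) + A\cos(\omega_2t) = 2A\cos\left[\left(\frac{\omega_1+\omega_2}{2}\right)t\right]\cos\left[\left(\frac{\omega_1-\omega_2}{2}\right)t\right] \qquad (3.103)$$
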


which leads to the phenomenon of beats as illustrated for both 
the time domain and frequency domain in figure 3.11. 
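A brief numerical sketch of equation (3.103) follows. It checks that the sum of the two harmonic terms equals the product of a carrier at the mean frequency and a slowly varying envelope, and it evaluates the spacing between successive zeros of the envelope. The frequencies $\omega_1 = 10$ and $\omega_2 = 11$, the unit amplitude, and the function names are illustrative assumptions.

```python
import math

def two_tone(t, amplitude, omega1, omega2):
    """Left-hand side of equation (3.103)."""
    return amplitude * math.cos(omega1 * t) + amplitude * math.cos(omega2 * t)

def carrier_times_envelope(t, amplitude, omega1, omega2):
    """Right-hand side of equation (3.103)."""
    carrier = math.cos(0.5 * (omega1 + omega2) * t)
    envelope = 2.0 * amplitude * math.cos(0.5 * (omega1 - omega2) * t)
    return envelope * carrier

def beat_period(omega1, omega2):
    """Time between successive zeros of the envelope."""
    return 2.0 * math.pi / abs(omega1 - omega2)

omega1, omega2 = 10.0, 11.0
times = [0.01 * i for i in range(2000)]
max_difference = max(
    abs(two_tone(t, 1.0, omega1, omega2) - carrier_times_envelope(t, 1.0, omega1, omega2))
    for t in times
)
print(max_difference, beat_period(omega1, omega2))
```

For these values the envelope first vanishes at $t = \pi$, where the two tones are exactly out of phase.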


3.9.2 The free linearly-damped linear oscillator


The response of the free, linearly-damped, linear oscillator is one of the most frequently encountered waveforms in science and thus it is useful to investigate the Fourier transform of this waveform. The waveform amplitude for the underdamped case, shown in figure 3.5, is given by equation (3.35), that is

$$f(t) = Ae^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \delta) \qquad t \geq 0 \qquad (3.104)$$

$$f(t) = 0 \qquad t < 0 \qquad (3.105)$$

where $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$ and where $\omega_0$ is the angular frequency of the undamped system. The Fourier transform is given by

$$G(\omega) = \frac{\omega_1}{\left(\omega^2-\omega_1^2\right)^2+\left(\Gamma\omega\right)^2}\left[\left(\omega^2-\omega_1^2\right) - i\Gamma\omega\right] \qquad (3.106)$$

which is complex and has the famous Lorentz form.

Figure 3.12: The intensity $|f(t)|^2$ and Fourier transform $|G(\omega)|^2$ of the free linearly-underdamped harmonic oscillator with $\omega_0 = 10$ and damping $\Gamma = 1$.

The intensity of the wave gives

$$|f(t)|^2 = A^2e^{-\Gamma t}\cos^2(\omega_1t - \delta) \qquad (3.107)$$

$$|G(\omega)|^2 = \frac{\omega_1^2}{\left(\omega^2-\omega_1^2\right)^2+\left(\Gamma\omega\right)^2} \qquad (3.108)$$

Note that since the average over $2\pi$ of $\cos^2$ is $\frac{1}{2}$, then the average over the $\cos^2(\omega_1t - \delta)$ term gives the intensity $I(t) = \frac{A^2}{2}e^{-\Gamma t}$ which has a mean lifetime for the decay of $\tau = \frac{1}{\Gamma}$. The $|G(\omega)|^2$ distribution has the classic Lorentzian shape, shown in figure 3.12, which has a full width at half-maximum, FWHM, equal to $\Gamma$. Note that $G(\omega)$ is complex and thus one also can determine the phase shift $\delta$ which is given by the ratio of the imaginary to real parts of equation 3.106, i.e. $\tan\delta = \frac{\Gamma\omega}{\omega_1^2-\omega^2}$.



The mean lifetime of the exponential decay of the intensity can be determined either by measuring $\tau$ from the time dependence, or by measuring the FWHM $\Gamma = \frac{1}{\tau}$ of the Fourier transform $|G(\omega)|^2$. In nuclear and atomic physics excited levels decay by photon emission with the waveform of the free linearly-damped, linear oscillator. Typically the mean lifetime $\tau$ can be measured when $\tau > 10^{-12}$ s, whereas for shorter lifetimes the radiation width $\Gamma$ becomes sufficiently large to be measured. Thus the two experimental approaches are complementary.
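The complementarity of the two measurements can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the method used for figure 3.12: it takes the waveform of equation (3.104) with $A = 1$, $\delta = 0$, $\omega_0 = 10$ and $\Gamma = 1$, evaluates $G(\omega) = \int f(t)e^{-i\omega t}dt$ by a simple midpoint rule (the step size and cut-off time are arbitrary choices), and measures the full width at half maximum of $|G(\omega)|^2$.

```python
import cmath
import math

OMEGA0 = 10.0                    # undamped angular frequency (as in figure 3.12)
GAMMA = 1.0                      # damping width
OMEGA1 = math.sqrt(OMEGA0**2 - (GAMMA / 2.0)**2)

DT = 0.01                        # integration step (assumed)
T_MAX = 30.0 / GAMMA             # amplitude has decayed by about e^-15 here
TIMES = [(i + 0.5) * DT for i in range(int(T_MAX / DT))]
SIGNAL = [math.exp(-0.5 * GAMMA * t) * math.cos(OMEGA1 * t) for t in TIMES]

def intensity_spectrum(omega):
    """|G(omega)|^2 from a midpoint-rule Fourier integral of the waveform."""
    total = sum(f * cmath.exp(-1j * omega * t) for f, t in zip(SIGNAL, TIMES))
    return abs(total * DT) ** 2

def argmax_unimodal(func, lo, hi, iterations=40):
    """Ternary search for the maximum of a single-peaked function."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            lo = a
        else:
            hi = b
    return 0.5 * (lo + hi)

def crossing(func, level, lo, hi, iterations=30):
    """Bisection for func = level, assuming func(lo) and func(hi) straddle level."""
    below_at_lo = func(lo) < level
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (func(mid) < level) == below_at_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

omega_peak = argmax_unimodal(intensity_spectrum, OMEGA0 - 0.5, OMEGA0 + 0.5)
half_max = 0.5 * intensity_spectrum(omega_peak)
lower = crossing(intensity_spectrum, half_max, OMEGA0 - 2.0 * GAMMA, omega_peak)
upper = crossing(intensity_spectrum, half_max, omega_peak, OMEGA0 + 2.0 * GAMMA)
fwhm = upper - lower
tau = 1.0 / GAMMA
print(fwhm, fwhm * tau)
```

For this weakly damped case the measured width is close to $\Gamma$, so the product of the spectral width and the mean lifetime is close to unity.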


3.9.3 Damped linear oscillator subject to an arbitrary periodic force 


Fourier’s theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms. Consider the response of a damped linear oscillator to an arbitrary periodic force.

$$F(t) = \sum_{n=0}^{N}\alpha_nF_0(\omega_n)\cos(\omega_nt + \delta_n) \qquad (3.109)$$

For each harmonic term $\omega_n$ the response of a linearly-damped linear oscillator to the forcing function $F(t) = F_0(\omega_n)\cos(\omega_nt)$ is given by equations (3.65) to (3.67) to be

$$x(t)_{Total} = x(t)_T + x(t)_S = \frac{F_0(\omega_n)}{m}\left[e^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \delta_n) + \frac{1}{\sqrt{\left(\omega_0^2-\omega_n^2\right)^2+\left(\Gamma\omega_n\right)^2}}\cos(\omega_nt - \delta_n)\right] \qquad (3.110)$$

The amplitude is obtained by substituting into (3.110) the derived values $F_0(\omega_n)$ from the Fourier analysis.


3.2 Example: Vibration isolation 


Frequently it is desired to isolate instrumentation from the influence of horizontal and vertical external vibrations that exist in the environment. One arrangement to achieve this isolation is to mount a heavy base of mass $m$ on weak springs of spring constant $k$ plus weak damping. The response of this system is given by equation 3.110, which exhibits a resonance at the angular frequency $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$ associated with each resonant frequency $\omega_0$ of the system. For each resonant frequency the system amplifies the vibrational amplitude for angular frequencies close to resonance, that is, below $\sqrt{2}\,\omega_0$, while it attenuates the vibration roughly by a factor of $\left(\frac{\omega_0}{\omega}\right)^2$ at higher frequencies. To avoid the amplification near the resonance it is necessary to make $\omega_0$ very much smaller than the frequency range of the vibrational spectrum and have a moderately high $Q$ value. This is achieved by use of a very heavy base and weak spring constant so that $\omega_0$ is very small. A typical table may have the resonance frequency at 0.5 Hz, which is well below typical perturbing vibrational frequencies, and thus the table attenuates the vibration by 99% at 5 Hz, with even more attenuation for higher frequency perturbations. This principle is used extensively in the design of vibration-isolation tables for optics or microbalance equipment.

Figure: Seismic isolation of an optical bench. Labels: rigid tabletop, damper, soft spring, 5 ton.
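The quoted 99% attenuation follows from the $(\omega_0/\omega)^2$ estimate, as the short sketch below illustrates. The comparison with the full steady-state amplitude, taken relative to its zero-frequency value, assumes $Q = \omega_0/\Gamma$ with an illustrative $Q = 5$; that value and the function names are assumptions, not figures from the example.

```python
import math

def high_frequency_attenuation(f_drive, f_resonance):
    """Approximate amplitude ratio (omega_0 / omega)^2 well above resonance."""
    return (f_resonance / f_drive) ** 2

def relative_response(f_drive, f_resonance, quality_factor):
    """Steady-state amplitude relative to the static (zero-frequency) response."""
    w = 2.0 * math.pi * f_drive
    w0 = 2.0 * math.pi * f_resonance
    gamma = w0 / quality_factor          # assumes Q = omega_0 / Gamma
    return w0**2 / math.sqrt((w0**2 - w**2)**2 + (gamma * w)**2)

approx_ratio = high_frequency_attenuation(5.0, 0.5)      # table resonance at 0.5 Hz
exact_ratio = relative_response(5.0, 0.5, 5.0)
near_resonance = relative_response(0.6, 0.5, 5.0)        # below sqrt(2) * 0.5 Hz

print(f"attenuation at 5 Hz: {100.0 * (1.0 - approx_ratio):.0f}%")
print(exact_ratio, near_resonance)
```

The approximate and full expressions agree closely at 5 Hz, while just above the resonance the same function exceeds unity, showing the amplification that the design must keep below the vibrational spectrum.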


3.10 Signal processing


It has been shown that the response of the linearly-damped linear oscillator, subject to any arbitrary periodic 
force, can be calculated using a frequency decomposition, (Fourier analysis), of the force, appendix I. The 
response also can be calculated using a time-ordered discrete-time sampling of the pulse shape; that is, the 
Green’s function approach, appendix J. The linearly-damped, linear oscillator is the simplest example of 
a linear system that exhibits both resonance and frequency-dependent response. Typically physical linear 
systems exhibit far more complicated response functions having multiple resonances. For example, an au- 
tomobile suspension system involves four wheels and associated springs plus dampers allowing the car to 
rock sideways, or forward and backward, in addition to the up-down motion, when subject to the forces 
produced by a rough road. Similarly a suspension bridge or aircraft wing can twist as well as bend due to 
air turbulence, or a building can undergo complicated oscillations due to seismic waves. An acoustic system 
exhibits similar complexity. Signal analysis and signal processing is of pivotal importance to elucidating the 
response of complicated linear systems to complicated periodic forcing functions. Signal processing is used 
extensively in engineering, acoustics, and science. 

The response of a low-pass filter, such as an R-C circuit or a coaxial cable, to an input square wave, shown in figure 3.13, provides a simple example of the relative advantages of using the complementary Fourier analysis in the frequency domain, or the Green’s discrete-function analysis in the time domain. The response of a repetitive square-wave input signal is shown in the time domain plus the Fourier transform to the frequency domain. The middle curves show the time dependence for the response of the low-pass filter to an impulse $I(t)$ and the corresponding Fourier transform $H(\omega)$. The output of the low-pass filter can be calculated by folding the input square wave and impulse time dependence in the time domain as shown on the left, or from the product of their Fourier transforms shown on the right. Working in the frequency domain, the response of linear mechanical systems, such as an automobile suspension or a musical instrument, as well as linear electronic signal processing systems such as amplifiers, loudspeakers and microphones, can be treated as black boxes having a certain transfer function $H(\omega,\phi)$ describing the gain and phase shift versus frequency. That is, the output wave frequency decomposition is

$$G(\omega)_{output} = H(\omega,\phi)\cdot G(\omega)_{input} \qquad (3.111)$$

Working in the time domain, the low-pass system has an impulse response $I(t) = e^{-\frac{t}{\tau}}$, which is the Fourier transform of the transfer function $H(\omega,\phi)$. In the time domain

$$y(t)_{output} = \int_{-\infty}^{t}x(t')\cdot I(t-t')\,dt' \qquad (3.112)$$



This is shown schematically in figure 3.13. The Fourier transformation connects the three quantities in the 
time domain with the corresponding three in the frequency domain. For example, the impulse response of 
the low-pass filter has a fall time of 7 which is related by a Fourier transform to the width of the transfer 
function. Thus the time and frequency domain approaches are closely related and give the same result for 
the output signal for the low-pass filter to the applied square-wave input signal. The result is that the 
higher-frequency components are attenuated leading to slow rise and fall times in the time domain. 
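The equivalence of the time-domain and frequency-domain routes can be demonstrated with a small discrete sketch. It is illustrative only: the sample count, the time constant, the normalisation of the impulse response, and the function names are assumptions, and a plain $O(N^2)$ discrete Fourier transform is used in place of a library FFT so that the example is self-contained. One period of a square wave is folded with a sampled exponential impulse response (circular convolution), and the result is compared with the inverse transform of the product of the two spectra, as in equation (3.111).

```python
import cmath
import math

def dft(samples, inverse=False):
    """Plain O(N^2) discrete Fourier transform (inverse includes the 1/N factor)."""
    n = len(samples)
    sign = 1.0 if inverse else -1.0
    result = []
    for k in range(n):
        total = sum(samples[m] * cmath.exp(sign * 2j * math.pi * m * k / n)
                    for m in range(n))
        result.append(total / n if inverse else total)
    return result

def circular_convolution(signal, impulse_response):
    """Time-domain folding of one period of the signal with the impulse response."""
    n = len(signal)
    return [sum(signal[m] * impulse_response[(i - m) % n] for m in range(n))
            for i in range(n)]

N = 64                      # samples per period (assumed)
DT = 1.0 / N                # sample spacing for a period of 1
TAU = 0.05                  # fall time of the low-pass filter (assumed)
square_wave = [1.0 if i < N // 2 else 0.0 for i in range(N)]
impulse_response = [math.exp(-i * DT / TAU) * DT / TAU for i in range(N)]

# Time domain: fold the input with the impulse response
output_time = circular_convolution(square_wave, impulse_response)

# Frequency domain: multiply the spectra, then transform back
transfer_function = dft(impulse_response)
output_spectrum = [h * g for h, g in zip(transfer_function, dft(square_wave))]
output_freq = [value.real for value in dft(output_spectrum, inverse=True)]

max_difference = max(abs(a - b) for a, b in zip(output_time, output_freq))
print(max_difference, output_time[1], abs(transfer_function[5]) / abs(transfer_function[0]))
```

The two outputs agree to rounding error. The output just after the rising edge is below the input level, and the transfer function magnitude falls with frequency, which is the slow rise time and high-frequency attenuation described above.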

Analog signal processing and Fourier analysis were the primary tools to analyze and process all forms of periodic motion during the 20th century. For example, musical instruments, mechanical systems, and electronic circuits all employed resonant systems to enhance the desired frequencies and suppress the undesirable frequencies, and the signals could be observed using analog oscilloscopes. The remarkable development of computing has enabled use of digital signal processing, leading to a revolution in signal processing that has had a profound impact on both science and engineering. The digital oscilloscope, which can sample at frequencies above $10^{9}$ Hz, has replaced the analog oscilloscope because it allows sophisticated analysis of each individual signal that was not possible using analog signal processing. For example, the analog approach in nuclear physics used tiny analog electric signals, produced by many individual radiation detectors, that were transmitted hundreds of meters via carefully shielded and expensive coaxial cables to the data room where the signals were amplified and signal processed using analog filters to maximize the signal to noise in order to separate the signal from the background noise. Stray electromagnetic radiation picked up via the cables significantly degraded the signals. The performance and limitations of the analog electronics severely restricted the pulse processing capabilities. Digital signal processing has rapidly replaced analog signal processing. Analog to digital detector circuits are built directly into the electronics for each individual detector so that only digital information needs to be transmitted from each detector to the analysis computers. Computer processing provides unlimited and flexible processing capabilities for the digital signals, greatly enhancing the response and sensitivity of our detector systems. Digital CD and DVD disks are a common application of digital signal processing.

Figure 3.13: Response of an RC electrical circuit to an input square wave. The upper row shows the time and the exponential-form frequency representations of the square-wave input signal. The middle row gives the impulse response, and corresponding transfer function for the RC circuit. The bottom row shows the corresponding output properties in both the time and frequency domains.


3.11 Wave propagation 


Wave motion typically involves a packet of waves encompassing a finite number of wave cycles. Information 
in a wave only can be transmitted by starting, stopping, or modulating the amplitude of a wave train, which 
is equivalent to forming a wave packet. For example, a musician will play a note for a finite time, and this 
wave train propagates out as a wave packet of finite length. You have no information as to the frequency 
and amplitude of the sound prior to the wave packet reaching you, or after the wave packet has passed you. 
The velocity of the wavelets contained within the wave packet is called the phase velocity. For a dispersive 
system the phase velocity of the wavelets contained within the wave packet is frequency dependent and the 
shape of the wave packet travels at the group velocity which usually differs from the phase velocity. If 
the shape of the wave packet is time dependent, then neither the phase velocity, which is the velocity of the 
wavelets, nor the group velocity, which is the velocity of an instantaneous point fixed to the shape of the 
wave packet envelope, represent the actual velocity of the overall wavepacket. 

A third wavepacket velocity, the signal velocity, is defined to be the velocity of the leading edge of the 
energy distribution, and corresponding information content, of the wave packet. For most linear systems 
the shape of the wave packet is not time dependent and then the group and signal velocities are identical. 
However, the group and signal velocities can be very different for non-linear systems as discussed in chapter 
4.7. Note that even when the phase velocity of the waves within the wave packet travels faster than the group 
velocity of the shape, or the signal velocity of the energy content of the envelope of the wave packet, the 
information contained in a wave packet is only manifest when the wave packet envelope reaches the detector 
and this energy and information travel at the signal velocity. The modern ideas of wave propagation, 
including Hamilton’s concept of group velocity, were developed by Lord Rayleigh when applied to the theory 
of sound[Ray1887]. The concept of phase, group, and signal velocities played a major role in discussion of 
electromagnetic waves as well as de Broglie’s development of wave-particle duality in quantum mechanics. 


3.11.1 Phase, group, and signal velocities of wave packets


The concepts of wave packets, as well as their phase, group, and signal velocities, are of considerable importance for propagation of information and other manifestations of wave motion in science and engineering. This importance warrants further discussion at this juncture.

Consider a particular $k, \omega$ component of a one-dimensional wave,

$$q(x,t) = Ae^{i(kx-\omega t)} \qquad (3.113)$$

The argument of the exponential is called the phase $\phi$ of the wave where

$$\phi \equiv kx - \omega t \qquad (3.114)$$

If we move along the $x$ axis at a velocity such that the phase is constant then we perceive a stationary pattern in this moving frame. The velocity of this wave is called the phase velocity. To ensure constant phase requires that $\phi$ is constant, or assuming real $k$ and $\omega$

$$\omega\,dt = k\,dx \qquad (3.115)$$

Therefore the phase velocity is defined to be

$$v_{phase} = \frac{\omega}{k} \qquad (3.116)$$

The velocity discussed so far is just the phase velocity of the individual wavelets at the carrier frequency. If $k$ or $\omega$ are complex then one must take the real parts to ensure that the velocity is real.

If the phase velocity of a wave is dependent on the wavelength, that is, $v_{phase}(k)$, then the system is said to be dispersive in that the wave is dispersed according to the wavelength. The simplest illustration of dispersion is the refraction of light in a glass prism which leads to dispersion of the light into the spectrum of wavelengths. Dispersion leads to development of wave packets that travel at group and signal velocities that usually differ from the phase velocity. To illustrate this behavior, consider two equal amplitude travelling waves having slightly different wave number $k$ and angular frequency $\omega$. Superposition of these waves gives

$$q(x,t) = A\left(e^{i[kx-\omega t]} + e^{i[(k+\Delta k)x-(\omega+\Delta\omega)t]}\right)$$
$$= Ae^{i\left[\left(k+\frac{\Delta k}{2}\right)x-\left(\omega+\frac{\Delta\omega}{2}\right)t\right]}\left\{e^{-i\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right]} + e^{i\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right]}\right\}$$
$$= 2Ae^{i\left[\left(k+\frac{\Delta k}{2}\right)x-\left(\omega+\frac{\Delta\omega}{2}\right)t\right]}\cos\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right] \qquad (3.117)$$

This corresponds to a wave with the average carrier frequency modulated by the cosine term which has a wavenumber of $\frac{\Delta k}{2}$ and angular frequency $\frac{\Delta\omega}{2}$, that is, this is the usual example of beats. The cosine term modulates the average wave producing wave packets as shown in figure 3.11. The velocity of these wave packets is called the group velocity given by requiring that the phase of the modulating term is constant, that is

$$\frac{\Delta k}{2}dx = \frac{\Delta\omega}{2}dt \qquad (3.118)$$

Thus the group velocity is given by

$$v_{group} = \frac{dx}{dt} = \frac{\Delta\omega}{\Delta k} \qquad (3.119)$$

If dispersion is present then the group velocity $v_{group} = \frac{\Delta\omega}{\Delta k}$ does not equal the phase velocity $v_{phase} = \frac{\omega}{k}$.

Expanding the above example to superposition of $n$ waves gives

$$q(x,t) = \sum_{r=1}^{n}A_re^{i(k_rx-\omega_rt)} \qquad (3.120)$$

In the event that $n \rightarrow \infty$ and the frequencies are continuously distributed, then the summation is replaced by an integral

$$q(x,t) = \int_{-\infty}^{\infty}A(k)e^{i(kx-\omega t)}dk \qquad (3.121)$$

where the factor $A(k)$ represents the distribution amplitudes of the component waves, that is, the spectral decomposition of the wave. This is the usual Fourier decomposition of the spatial distribution of the wave.

Consider an extension of the linear superposition of two waves to a well defined wave packet where the amplitude is nonzero only for a small range of wavenumbers $k_0 \pm \Delta k$.

$$q(x,t) = \int_{k_0-\Delta k}^{k_0+\Delta k}A(k)e^{i(kx-\omega t)}dk \qquad (3.122)$$

This functional shape is called a wave packet which only has meaning if $\Delta k \ll k_0$. The angular frequency can be expressed by making a Taylor expansion around $k_0$

$$\omega(k) = \omega(k_0) + \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0) + \ldots \qquad (3.123)$$

For a linear system the phase then reduces to

$$kx - \omega t = (k_0x - \omega_0t) + (k-k_0)x - \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0)t \qquad (3.124)$$

The summation of terms in the exponent given by 3.124 leads to the amplitude 3.122 having the form of a product where the integral becomes

$$q(x,t) = e^{i(k_0x-\omega_0t)}\int_{k_0-\Delta k}^{k_0+\Delta k}A(k)e^{i(k-k_0)\left[x-\left(\frac{d\omega}{dk}\right)_{k_0}t\right]}dk \qquad (3.125)$$

The integral term modulates the $e^{i(k_0x-\omega_0t)}$ first term.

The group velocity is defined to be that for which the phase of the exponential term in the integral is constant. Thus

$$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \qquad (3.126)$$

Since $\omega = kv_{phase}$ then

$$v_{group} = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \qquad (3.127)$$

For non-dispersive systems the phase velocity is independent of the wave number $k$ or angular frequency $\omega$ and thus $v_{group} = v_{phase}$. The case discussed earlier, equation (3.103), for beating of two waves gives the same relation in the limit that $\Delta\omega$ and $\Delta k$ are infinitesimal.
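Equations (3.119), (3.126) and (3.127) can be compared numerically for a dispersive system. The sketch below is hedged: the dispersion relation $\omega(k) = \sqrt{\omega_c^2 + c^2k^2}$ is a hypothetical example chosen for illustration (it has the form met for waves above a cut-off frequency), and the constants, step sizes, and function names are assumptions. The derivative $d\omega/dk$ is compared with $v_{phase} + k\,\partial v_{phase}/\partial k$ and with the two-wave estimate $\Delta\omega/\Delta k$.

```python
import math

C = 3.0          # high-frequency wave speed, arbitrary units (assumed)
OMEGA_C = 5.0    # cut-off angular frequency (assumed)

def omega(k):
    """Hypothetical dispersion relation used only for illustration."""
    return math.sqrt(OMEGA_C**2 + (C * k)**2)

def phase_velocity(k):
    return omega(k) / k                           # equation (3.116)

def derivative(func, k, h=1e-5):
    """Central finite-difference derivative."""
    return (func(k + h) - func(k - h)) / (2.0 * h)

def group_velocity(k):
    return derivative(omega, k)                   # equation (3.126)

def group_velocity_from_phase(k):
    return phase_velocity(k) + k * derivative(phase_velocity, k)   # equation (3.127)

def two_wave_group_velocity(k, delta_k):
    return (omega(k + delta_k) - omega(k)) / delta_k               # equation (3.119)

k0 = 2.0
v_g = group_velocity(k0)
v_g_from_phase = group_velocity_from_phase(k0)
v_g_two_wave = two_wave_group_velocity(k0, 1e-3)
v_p = phase_velocity(k0)
print(v_p, v_g, v_g_from_phase, v_g_two_wave)
```

For this particular relation the group velocity is smaller than the phase velocity, and the product $v_{phase}v_{group}$ equals $c^2$, the same structure that appears for waves breaking on a beach.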

The group velocity of a wave packet is of physical significance for dispersive media where $v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \neq \frac{\omega}{k} = v_{phase}$. Every wave train has a finite extent and thus we usually observe the motion of a group of waves rather than the wavelets moving within the wave packet. In general, for non-linear dispersive systems the derivative $\frac{\partial v_{phase}}{\partial k}$ can be either positive or negative and thus in principle the group velocity can either be greater than, or less than, the phase velocity. Moreover, if the group velocity is frequency dependent, that is, when group velocity dispersion occurs, then the overall shape of the wave packet is time dependent and thus the speed of a specific relative location defined by the shape of the envelope of the wave packet does not represent the signal velocity of the wave packet. Brillouin showed that the distribution of the energy, and corresponding information content, for any wave packet, travels at the signal velocity which can be different from the group velocity if the shape of the envelope of the wave packet is time dependent. For electromagnetic waves one has the possibility that the group velocity $v_{group} > v_{phase} = c$. In 1914 Brillouin [Bri14][Bri60] showed that the signal velocity of electromagnetic waves, defined by the leading edge of the time-dependent envelope of the wave packet, never exceeds $c$ even though the group velocity corresponding to the velocity of the instantaneous shape of the wave packet may exceed $c$. Thus, there is no violation of Einstein’s fundamental principle of relativity that the velocity of an electromagnetic wave cannot exceed $c$.


3.3 Example: Water waves breaking on a beach


The concepts of phase and group velocity are illustrated by the example of water waves moving at velocity $v$ incident upon a straight beach at an angle $\alpha$ to the shoreline. Consider that the wavepacket comprises many wavelengths of wavelength $\lambda$. During the time it takes the wave to travel a distance $\lambda$, the point where the crest of one wave breaks on the beach travels a distance $\frac{\lambda}{\cos\alpha}$ along the beach. Thus the phase velocity of the crest of the one wavelet in the wave packet is

$$v_{phase} = \frac{v}{\cos\alpha}$$

The velocity of the wave packet along the beach equals

$$v_{group} = v\cos\alpha$$

Note that for the wave moving parallel to the beach $\alpha = 0$ and $v_{phase} = v_{group} = v$. However, for $\alpha = \frac{\pi}{2}$, $v_{phase} \rightarrow \infty$ and $v_{group} \rightarrow 0$. In general for waves breaking on the beach

$$v_{phase}\,v_{group} = v^2$$

The same behavior is exhibited by surface waves bouncing off the sides of the Erie canal, sound waves in a trombone, and electromagnetic waves transmitted down a rectangular wave guide. In the latter case the phase velocity exceeds the velocity of light $c$ in apparent violation of Einstein’s theory of relativity. However, the information travels at the signal velocity which is less than $c$.


3.4 Example: Surface waves for deep water 


In the “Theory of Sound” [Ray1887] Rayleigh discusses the example of surface waves for water. He derives
a dispersion relation linking the angular frequency $\omega$, and hence the phase velocity $v_{phase} = \omega/k$, to the
wavenumber $k$ in terms of the density $\rho$, depth $l$, gravity $g$, and surface tension $T$:

$$\omega^2 = \left(gk + \frac{Tk^3}{\rho}\right)\tanh(kl)$$

For deep water where the wavelength is short compared with the depth, that is $kl \gg 1$, then $\tanh(kl) \to 1$
and the dispersion relation is given approximately by

$$\omega^2 = gk + \frac{Tk^3}{\rho}$$

For long surface waves for deep water, that is, small $k$, the gravitational first term in the dispersion
relation dominates and the group velocity is given by

$$v_{group} = \frac{d\omega}{dk} = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2}\frac{\omega}{k} = \frac{v_{phase}}{2}$$


That is, the group velocity is half of the phase velocity. Here the wavelets are building at the back of the wave 
packet, progress through the wave packet and dissipate at the front. This can be demonstrated by dropping a 
pebble into a calm lake. It will be seen that the surface disturbance comprises a wave packet moving outwards 
at the group velocity with the individual waves within the wave packet expanding at twice the group velocity 
of the wavepacket, that is, they are created at the inner radius of the wave packet and disappear at the outer 
radius of the wave packet. 
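
The factor of one half can be verified numerically from the deep-water dispersion relation $\omega^2 = gk + Tk^3/\rho$. The sketch below is illustrative only: the surface tension and density are assumed room-temperature values for water, and the group velocity is estimated with a central finite difference rather than an analytic derivative.

```python
import math

G = 9.81        # gravitational acceleration, m/s^2
T = 0.072       # surface tension of water, N/m (assumed room-temperature value)
RHO = 1000.0    # density of water, kg/m^3 (assumed)

def omega(k):
    """Deep-water dispersion relation: omega^2 = g*k + T*k^3/rho."""
    return math.sqrt(G * k + T * k**3 / RHO)

def velocity_ratio(k, dk_frac=1e-6):
    """Return v_group / v_phase at wavenumber k (rad/m).

    v_group = d(omega)/dk is estimated with a central finite difference.
    """
    dk = k * dk_frac
    v_group = (omega(k + dk) - omega(k - dk)) / (2 * dk)
    v_phase = omega(k) / k
    return v_group / v_phase

# Long gravity waves (small k) and short ripples (large k).
for k in (0.1, 1.0, 1.0e3, 1.0e5):
    print(k, round(velocity_ratio(k), 4))
```

For small $k$ the ratio tends to 1/2, the gravity-wave result. For large $k$, where surface tension dominates, it tends to 3/2.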

For small wavelength ripples, where $k$ is large, the surface tension term dominates and the dispersion
relation is approximately given by

$$\omega^2 \simeq \frac{Tk^3}{\rho}$$

leading to a group velocity of

$$v_{group} = \frac{d\omega}{dk} = \frac{3}{2} v_{phase}$$

Here the group velocity exceeds the phase velocity and wavelets are building at the front of the wave packet and
dissipate at the back. Note that for this linear system, the Brillouin signal velocity equals the group velocity
for both gravity and surface tension waves for deep water.


3.5 Example: Electromagnetic waves in the ionosphere


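The response to radio waves, incident upon a free electron plasma in the ionosphere, provides an excellent
example that involves cut-off frequency, complex wavenumber $k$, as well as the phase, group, and signal
velocities. Maxwell’s equations give the most general wave equation for electromagnetic waves to be

$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu\frac{\partial\mathbf{j}_{free}}{\partial t} + \nabla\left(\frac{\rho_{free}}{\varepsilon}\right)$$

$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} = -\nabla\times\mathbf{j}_{free}$$

where $\rho_{free}$ and $\mathbf{j}_{free}$ are the unbound charge and current densities. The effects of the bound charges and
currents are absorbed into $\varepsilon$ and $\mu$. Ohm’s Law can be written in terms of the electrical conductivity $\sigma$ which
is a constant

$$\mathbf{j} = \sigma\mathbf{E}$$

Assuming Ohm’s Law plus assuming $\rho_{free} = 0$ in the plasma gives the relations

$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{E}}{\partial t} = 0$$

$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{H}}{\partial t} = 0$$

The third term in both of these wave equations is a damping term that leads to a damped solution of an
electromagnetic wave in a good conductor.
These damped wave equations can be solved by considering an incident wave

$$\mathbf{E} = \mathbf{E}_0 e^{i(\omega t - kz)}$$

Substituting for $\mathbf{E}$ in the first damped wave equation gives

$$-k^2 + \omega^2\varepsilon\mu - i\omega\sigma\mu = 0$$

That is

$$k^2 = \omega^2\varepsilon\mu\left[1 - \frac{i\sigma}{\omega\varepsilon}\right]$$

In general $k$ is complex, that is, it has real $k_R$ and imaginary $k_I$ parts, $k = k_R - ik_I$, that lead to a solution of the form

$$\mathbf{E} = \mathbf{E}_0 e^{-k_I z} e^{i(\omega t - k_R z)}$$

The first exponential term is an exponential damping term while the second exponential term is the oscillating
term.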

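Consider that the plasma involves the motion of a bound damped electron, of charge $q$ and mass $m$, bound
in a one dimensional atom or lattice subject to an oscillatory electric field of frequency $\omega$. Assume that the
electromagnetic wave is travelling in the $\hat{\mathbf{z}}$ direction with the transverse electric field in the $\hat{\mathbf{x}}$ direction. The
equation of motion of an electron can be written as

$$\ddot{\mathbf{x}} + \Gamma\dot{\mathbf{x}} + \omega_0^2\mathbf{x} = \hat{\mathbf{x}}\frac{q}{m}E_0 e^{i(\omega t - kz)}$$

where $\Gamma$ is the damping factor. The instantaneous displacement of the oscillating charge equals

$$\mathbf{x} = \frac{q}{m}\frac{1}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\hat{\mathbf{x}}E_0 e^{i(\omega t - kz)}$$

and the velocity is

$$\dot{\mathbf{x}} = \frac{q}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\hat{\mathbf{x}}E_0 e^{i(\omega t - kz)}$$

Thus the instantaneous current density, for $N$ electrons per unit volume, is given by

$$\mathbf{j} = Nq\dot{\mathbf{x}} = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\hat{\mathbf{x}}E_0 e^{i(\omega t - kz)}$$

Therefore the electrical conductivity is given by

$$\sigma = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}$$

Let us consider only unbound charges in the plasma, that is let $\omega_0 = 0$. Then the conductivity is given by

$$\sigma = \frac{Nq^2}{m}\frac{i\omega}{i\Gamma\omega - \omega^2}$$

For a low density ionized plasma $\omega \gg \Gamma$ thus the conductivity is given approximately by

$$\sigma \approx -i\frac{Nq^2}{m\omega}$$

Since $\sigma$ is pure imaginary, $\mathbf{j}$ and $\mathbf{E}$ have a phase difference of $\frac{\pi}{2}$ which implies that the average of
the Joule heating over a complete period is $\langle\mathbf{j}\cdot\mathbf{E}\rangle = 0$. Thus there is no energy loss due to Joule heating
implying that the electromagnetic energy is conserved.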

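Substitution of $\sigma$ into the relation for $k^2$ gives

$$k^2 = \omega^2\varepsilon\mu\left[1 - \frac{i\sigma}{\omega\varepsilon}\right] = \omega^2\varepsilon\mu\left[1 - \frac{Nq^2}{\varepsilon m\omega^2}\right]$$

Define the Plasma oscillation frequency $\omega_P$ to be

$$\omega_P \equiv \sqrt{\frac{Nq^2}{\varepsilon m}}$$

then $k^2$ can be written as

$$k^2 = \omega^2\varepsilon\mu\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] \qquad (\alpha)$$

For a low density plasma the dielectric constant $K_E \simeq 1$ and the relative permeability $K_B \simeq 1$ and thus
$\varepsilon = K_E\varepsilon_0 \simeq \varepsilon_0$ and $\mu = K_B\mu_0 \simeq \mu_0$. The velocity of light in vacuum is $c = \frac{1}{\sqrt{\varepsilon_0\mu_0}}$. Thus for low density
equation $\alpha$ can be written as

$$\omega^2 = \omega_P^2 + c^2k^2 \qquad (\beta)$$

Differentiation of equation $\beta$ with respect to $k$ gives $2\omega\frac{d\omega}{dk} = 2c^2k$. That is, $v_{phase}v_{group} = c^2$ and the phase
velocity is

$$v_{phase} = \frac{\omega}{k} = \sqrt{c^2 + \frac{\omega_P^2}{k^2}}$$

There are two cases to consider.

1) $\omega > \omega_P$: For this case $\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] > 0$ and thus $k$ is a pure real number. Therefore the
electromagnetic wave is transmitted with a phase velocity that exceeds $c$ while the group velocity is less than
$c$.

2) $\omega < \omega_P$: For this case $\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right] < 0$ and thus $k$ is a pure imaginary number. Therefore the
electromagnetic wave is not transmitted in the ionosphere and is attenuated rapidly as $e^{-k_I z}$ with $k_I = \frac{1}{c}\sqrt{\omega_P^2 - \omega^2}$. However,
since there are no Joule heating losses, the electromagnetic wave must be completely reflected. Thus the
Plasma oscillation frequency serves as a cut-off frequency. For this example the signal and group velocities
are identical.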
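
The two regimes can be explored with a short calculation based on the low-density dispersion relation $\omega^2 = \omega_P^2 + c^2k^2$. This is a sketch under stated assumptions: the function name, the returned dictionary keys, and the sample frequencies (a 3 MHz plasma frequency probed at 10 MHz and 1 MHz) are illustrative choices rather than anything prescribed by the text.

```python
import math

C = 2.998e8  # speed of light in vacuum, m/s

def plasma_wave(f, f_p):
    """Evaluate omega^2 = omega_P^2 + c^2 k^2 for a wave of frequency f (Hz)
    in a low-density plasma with plasma oscillation frequency f_p (Hz)."""
    w = 2 * math.pi * f
    w_p = 2 * math.pi * f_p
    if w > w_p:
        # Real wavenumber: the wave propagates.
        k = math.sqrt(w**2 - w_p**2) / C
        return {"transmitted": True, "v_phase": w / k, "v_group": C**2 * k / w}
    # Imaginary wavenumber: the field decays as exp(-k_imag * z).
    k_imag = math.sqrt(w_p**2 - w**2) / C
    depth = 1.0 / k_imag if k_imag > 0 else math.inf
    return {"transmitted": False, "attenuation_length": depth}

above = plasma_wave(10e6, 3e6)   # assumed 10 MHz wave, 3 MHz plasma frequency
below = plasma_wave(1e6, 3e6)    # assumed 1 MHz wave, same plasma
print(above["v_phase"] / C, above["v_group"] / C)
print(below["attenuation_length"], "m")
```

Above the cut-off the phase velocity exceeds $c$ and the group velocity is below $c$, with their product equal to $c^2$. Below the cut-off the sketch returns the e-folding length $1/k_I$ of the evanescent field.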

For the ionosphere $N \approx 10^{11}$ electrons/m$^3$, which corresponds to a Plasma oscillation frequency of
$\nu = \omega_P/2\pi = 3$ MHz. Thus electromagnetic waves in the AM waveband ($< 1.6$ MHz) are totally reflected by
the ionosphere and bounce repeatedly around the Earth, whereas for VHF frequencies above 3 MHz, the waves
are transmitted and refracted passing through the atmosphere. Thus light is transmitted by the ionosphere.
By contrast, for a good conductor like silver, the Plasma oscillation frequency is around $10^{16}$ Hz which is
in the far ultraviolet part of the spectrum. Thus, all lower frequencies, such as light, are totally reflected
by such a good conductor, whereas X-rays have frequencies above the Plasma oscillation frequency and are
transmitted.

3.11.2 Fourier transform of wave packets


The relation between the time distribution and the corresponding frequency distribution, or equivalently, the
spatial distribution and the corresponding wave-number distribution, is of considerable importance in discussion
of wave packets and signal processing. It directly relates to the uncertainty principle that is a characteristic
of all forms of wave motion. The relation between the time and corresponding frequency distribution is given
via the Fourier transform discussed in appendix I. The following are two examples of the Fourier transforms of
typical but rather different wavepacket shapes that are encountered often in science and engineering.


3.6 Example: Fourier transform of a Gaussian wave packet:


Assuming that the amplitude of the wave is a Gaussian wave packet shown in the adjacent figure where

$$G(\omega) = c\,e^{-\frac{(\omega - \omega_0)^2}{2\sigma_\omega^2}}$$

This leads to the Fourier transform

$$f(t) = c\sqrt{2\pi}\,\sigma_\omega\,e^{-\frac{\sigma_\omega^2t^2}{2}}\cos(\omega_0t)$$

[Figure: Fourier transform of a Gaussian frequency distribution. Panels: $G(\omega)$ versus $\omega$ and $f(t)$ versus $t$.]

Note that the wavepacket has a standard deviation for the amplitude of the wavepacket of $\sigma_t = \frac{1}{\sigma_\omega}$, that
is $\sigma_t\cdot\sigma_\omega = 1$. The Gaussian wavepacket results in the minimum product of the standard deviations of the
frequency and time representations for a wavepacket. This has profound importance for all wave phenomena,
and especially to quantum mechanics. Because matter exhibits wave-like behavior, the above property of wave
packets leads to Heisenberg’s Uncertainty Principle. For signal processing, it shows that if you truncate a
wavepacket you will broaden the frequency distribution. 
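
The unit product of the amplitude widths can be checked by direct numerical integration. The sketch below assumes a Gaussian spectrum of standard deviation $\sigma_\omega$ centred on $\omega_0$ and the corresponding time envelope $e^{-\sigma_\omega^2t^2/2}$; the value $\sigma_\omega = 5$ rad/s, the helper name, and the integration grid are arbitrary illustrative choices.

```python
import math

def amplitude_std(envelope, half_range, n=20001):
    """Standard deviation of a non-negative distribution sampled on
    [-half_range, +half_range], evaluated with a simple Riemann sum."""
    step = 2 * half_range / (n - 1)
    xs = [-half_range + i * step for i in range(n)]
    weights = [envelope(x) for x in xs]
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm
    return math.sqrt(var)

sigma_w = 5.0  # assumed spectral width in rad/s

def spectrum(dw):
    """Gaussian amplitude spectrum as a function of (omega - omega_0)."""
    return math.exp(-dw**2 / (2 * sigma_w**2))

def envelope(t):
    """Envelope of the transformed wave packet in time."""
    return math.exp(-(sigma_w * t) ** 2 / 2)

s_w = amplitude_std(spectrum, 10 * sigma_w)
s_t = amplitude_std(envelope, 10 / sigma_w)
print(s_w, s_t, s_w * s_t)
```

The cosine carrier is omitted because only the envelope determines the width. Changing `sigma_w` rescales both widths in opposite directions and leaves the product at unity.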


3.7 Example: Fourier transform of a rectangular wave packet: 


Assume unity amplitude of the frequency distribution between $\omega_0 - \Delta\omega \le \omega \le \omega_0 + \Delta\omega$, that is, a single
isolated square pulse of width $\tau$ that is described by the rectangular function $\Pi$ defined as

$$\Pi(\omega) = \begin{cases} 1 & |\omega - \omega_0| < \Delta\omega \\ 0 & |\omega - \omega_0| > \Delta\omega \end{cases}$$

Then the Fourier transform is given by

$$f(t) = \left[\frac{\sin\Delta\omega t}{\Delta\omega t}\right]\cos\omega_0t$$

That is, the transform of a rectangular wavepacket gives a cosine wave modulated by an unnormalized
sinc function which is a nice example of a simple wave packet. That is, on the right hand side we have
a wavepacket $\Delta t = \pm\frac{\pi}{\Delta\omega}$ wide. Note that the product of the two measures of the widths is $\Delta\omega\cdot\Delta t = \pm\pi$.

Example I.2 considers a rectangular pulse of unity amplitude between $-\frac{\tau}{2} \le t \le \frac{\tau}{2}$ which resulted in a
Fourier transform $G(\omega) = \tau\left[\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right]$. That is, for a pulse of width $\Delta t = \pm\frac{\tau}{2}$ the frequency envelope has
the first zero at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Note that this is the complementary system to the one considered here which has
$\Delta\omega\cdot\Delta t = \pm\pi$, illustrating the symmetry of the Fourier transform and its inverse.

3.11.3 Wave-packet Uncertainty Principle


The Uncertainty Principle states that wave motion exhibits a minimum product of the uncertainty in the
simultaneously measured width in time of a wave packet, and the distribution width of the frequency
decomposition of this wave packet. This was illustrated by the Fourier transforms of wave packets discussed
above where it was shown the product of the widths is minimized for a Gaussian-shaped wave packet. The
Uncertainty Principle implies that to make a precise measurement of the frequency of a sinusoidal wave
requires that the wave packet be infinitely long. If the duration of the wave packet is reduced then the
frequency distribution broadens. The crucial aspect needed for this discussion is that, for the amplitudes
of any wavepacket, the standard deviations $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$ characterizing the width of the spectral
distribution in the angular frequency domain, $\sigma_A(\omega)$, and the width for the conjugate variable in time $\sigma_A(t)$
are related:

$$\sigma_A(t)\cdot\sigma_A(\omega) \ge 1 \qquad \text{(Relation between amplitude uncertainties.)}$$

This product of the standard deviations equals unity only for the special case of Gaussian-shaped spectral
distributions, and it is greater than unity for all other shaped spectral distributions.

The intensity of the wave is the square of the amplitude leading to standard deviation widths for a
Gaussian distribution where $\sigma_I(t)^2 = \frac{1}{2}\sigma_A(t)^2$, that is, $\sigma_I(t) = \frac{\sigma_A(t)}{\sqrt{2}}$. Thus the standard deviations for the
spectral distribution and width of the intensity of the wavepacket are related by:

$$\sigma_I(t)\cdot\sigma_I(\omega) \ge \frac{1}{2} \qquad \text{(Uncertainty principle for frequency-time intensities)}$$

This states that the uncertainties with which you can simultaneously measure the time and frequency
for the intensity of a given wavepacket are related. If you try to measure the frequency within a short time
interval $\sigma_I(t)$ then the uncertainty in the frequency measurement is $\sigma_I(\omega) \ge \frac{1}{2\sigma_I(t)}$. Accurate measurement
of the frequency requires measurement times that encompass many cycles of oscillation, that is, a long
wavepacket.

Exactly the same relations exist between the spectral distribution as a function of wavenumber $k$, and
the corresponding spatial dependence of a wave on $x$, which are conjugate representations. Thus the spectral
distribution plotted versus $k_x$ is directly related to the amplitude as a function of position $x$; the spectral
distribution versus $k_y$ is related to the amplitude as a function of $y$; and the $k_z$ spectral distribution is related
to the spatial dependence on $z$. Following the same arguments discussed above, the standard deviation
$\sigma_I(k_x)$, characterizing the width of the spectral intensity distribution of $k_x$, and the standard deviation
$\sigma_I(x)$, characterizing the spatial width of the wave packet intensity as a function of $x$, are related by the
Uncertainty Principle for position-wavenumber. Thus in summary the temporal and spatial uncertainty
principles of the intensity of wave motion are

$$\sigma_I(t)\cdot\sigma_I(\omega) \ge \frac{1}{2} \qquad (3.128)$$
$$\sigma_I(x)\cdot\sigma_I(k_x) \ge \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \ge \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \ge \frac{1}{2}$$

This applies to all forms of wave motion, be they sound waves, water waves, electromagnetic waves, or
matter waves.

As discussed in chapter 18, the transition to quantum mechanics involves relating the matter-wave properties
to the energy and momentum of the corresponding particle. That is, in the case of matter waves,
multiplying both sides of equation 3.128 by $\hbar$ and using the de Broglie relations gives that the particle
energy is related to the angular frequency by $E = \hbar\omega$ and the particle momentum is related to the wavenumber,
that is $\mathbf{p} = \hbar\mathbf{k}$. These lead to the Heisenberg Uncertainty Principle:

$$\sigma_I(t)\cdot\sigma_I(E) \ge \frac{\hbar}{2} \qquad (3.129)$$
$$\sigma_I(x)\cdot\sigma_I(p_x) \ge \frac{\hbar}{2} \qquad \sigma_I(y)\cdot\sigma_I(p_y) \ge \frac{\hbar}{2} \qquad \sigma_I(z)\cdot\sigma_I(p_z) \ge \frac{\hbar}{2}$$

This uncertainty principle applies equally to the wavefunction of the electron in the
hydrogen atom, proton in a nucleus, as well as to a wavepacket describing a particle wave moving along some
trajectory. This implies that, for a particle of given momentum, the wavefunction is spread out spatially.
Planck’s constant $\hbar = 1.054 \times 10^{-34}$ J·s $= 6.582 \times 10^{-16}$ eV·s is extremely small compared with energies and
times encountered in normal life, and thus the effects due to the Uncertainty Principle are not important for
macroscopic dimensions.

Confinement of a particle, of mass $m$, within $\pm\sigma(x)$ of a fixed location implies that there is a corresponding
uncertainty in the momentum

$$\sigma(p_x) \ge \frac{\hbar}{2\sigma(x)} \qquad (3.130)$$

Now the variance in momentum $\mathbf{p}$ is given by the difference between the average of the square $\langle(\mathbf{p}\cdot\mathbf{p})\rangle$ and the
square of the average $\langle\mathbf{p}\rangle^2$. That is

$$\sigma(\mathbf{p})^2 = \langle(\mathbf{p}\cdot\mathbf{p})\rangle - \langle\mathbf{p}\rangle^2 \qquad (3.131)$$

Assuming a fixed average location implies that $\langle\mathbf{p}\rangle = 0$, then

$$\langle(\mathbf{p}\cdot\mathbf{p})\rangle = \sigma(\mathbf{p})^2 \ge \left(\frac{\hbar}{2\sigma(r)}\right)^2 \qquad (3.132)$$

Since the kinetic energy is given by:

$$\text{Kinetic energy} = \frac{\langle p^2\rangle}{2m} \ge \frac{\hbar^2}{8m\sigma(r)^2} \qquad \text{(Zero-point energy)}$$

This zero-point energy is the minimum kinetic energy that a particle of mass $m$ can have if confined within a
distance $\pm\sigma(r)$. This zero-point energy is a consequence of wave-particle duality and the uncertainty between
the size and wavenumber for any wave packet. It is a quantal effect in that the classical limit has $\hbar \to 0$ for
which the zero-point energy $\to 0$.

Inserting numbers for the zero-point energy gives that an electron confined to the radius of the atom,
that is $\sigma(x) = 10^{-10}$ m, has a zero-point kinetic energy of $\sim 1$ eV. Confining this electron to $3 \times 10^{-15}$ m, the
size of a nucleus, gives a zero-point energy of $10^9$ eV (1 GeV). Confining a proton to the size of the nucleus
gives a zero-point energy of 0.5 MeV. These values are typical of the level spacing observed in atomic and
nuclear physics. If $\hbar$ was a large number, then a billiard ball confined to a billiard table would be a blur
as it oscillated with the minimum zero-point kinetic energy. The smaller the spatial region to which the ball
was confined, the larger would be its zero-point energy and momentum, causing it to rattle back and forth
between the boundaries of the confined region. Life would be dramatically different if $\hbar$ was a large number.
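
The quoted zero-point energies follow from the non-relativistic bound $\hbar^2/(8m\sigma^2)$. The following sketch evaluates that bound for the three cases mentioned; the constants are standard values and the function name is an illustrative choice. It makes no attempt at a relativistic treatment, which the electron confined to nuclear dimensions would strictly require.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
EV = 1.602176634e-19     # joules per electron volt
M_ELECTRON = 9.1093837e-31   # kg
M_PROTON = 1.67262192e-27    # kg

def zero_point_energy_ev(mass, sigma):
    """Minimum kinetic energy hbar^2 / (8 m sigma^2), returned in eV,
    for a particle of the given mass (kg) confined within +/- sigma (m)."""
    return HBAR**2 / (8 * mass * sigma**2) / EV

print("electron in atom   :", zero_point_energy_ev(M_ELECTRON, 1e-10), "eV")
print("electron in nucleus:", zero_point_energy_ev(M_ELECTRON, 3e-15), "eV")
print("proton in nucleus  :", zero_point_energy_ev(M_PROTON, 3e-15), "eV")
```

The three results are close to 1 eV, 1 GeV, and 0.5 MeV respectively, in line with the values quoted above.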

In summary, Heisenberg’s Uncertainty Principle is a well-known and crucially important aspect of quantum
physics. What is less well known is that the Uncertainty Principle applies for all forms of wave motion,
that is, it is not restricted to matter waves. The following three examples illustrate application of the
Uncertainty Principle to acoustics, the nuclear Mössbauer effect, and quantum mechanics.


3.8 Example: Acoustic wave packet 


A violinist plays the note middle C (261.625 Hz) with constant intensity for precisely 2 seconds. Using
the fact that the velocity of sound in air is 343.2 m/s calculate the following:

1) The wavelength of the sound wave in air: $\lambda = 343.2/261.625 = 1.312$ m.

2) The length of the wavepacket in air: Wavepacket length $= 343.2 \times 2 = 686.4$ m.

3) The fractional frequency width of the note: Since the wave packet has a square pulse shape of length
$\tau = 2$ s, the Fourier transform is a sinc function having the first zeros when $\sin\frac{\omega\tau}{2} = 0$, that is, $\Delta\nu = \frac{1}{\tau}$.
Therefore the fractional width is $\frac{\Delta\nu}{\nu} = \frac{1}{\nu\tau} = 0.0019$. Note that to achieve a purity of $\frac{\Delta\nu}{\nu} = 10^{-6}$ the violinist
would have to play the note for 1.06 hours.
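
The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch using only the numbers given in the example and the relation $\Delta\nu = 1/\tau$; the variable and function names are illustrative.

```python
V_SOUND = 343.2      # speed of sound in air, m/s
FREQ = 261.625       # middle C, Hz
DURATION = 2.0       # length of the note, s

wavelength = V_SOUND / FREQ                   # part 1
packet_length = V_SOUND * DURATION            # part 2
fractional_width = 1.0 / (FREQ * DURATION)    # part 3: (1/tau) / nu

def duration_for_purity(purity, freq):
    """Note duration (s) needed for a fractional width delta_nu/nu = purity."""
    return 1.0 / (purity * freq)

hours = duration_for_purity(1e-6, FREQ) / 3600.0

print(round(wavelength, 3), "m")
print(round(packet_length, 1), "m")
print(round(fractional_width, 4))
print(round(hours, 2), "hours")
```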


3.9 Example: Gravitational red shift 


The Mössbauer effect in nuclear physics provides a wave packet that has an exceptionally small fractional
width in frequency. For example, the $^{57}$Fe nucleus emits a 14.4 keV deexcitation-energy photon which
corresponds to $\omega \approx 2 \times 10^{19}$ rad/s with a decay time of $\tau \approx 10^{-7}$ s. Thus the fractional width is $\frac{\Delta\omega}{\omega} \approx 3 \times 10^{-13}$.
In 1959 Pound and Rebka used this to test Einstein’s general theory of relativity by measurement of the
gravitational red shift between the attic and basement of the 22.5 m high physics building at Harvard. The
magnitude of the predicted relativistic red shift is $\frac{\Delta E}{E} = 2.5 \times 10^{-15}$ which is what was observed with a
fractional precision of about 1%.
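
The size of the predicted shift can be estimated with a short calculation. The sketch below assumes the standard weak-field expression $\Delta E/E = gh/c^2$, which is not derived in this chapter, together with $\omega = E/\hbar$ for the photon; the constant names are illustrative.

```python
G = 9.81             # gravitational acceleration, m/s^2
HEIGHT = 22.5        # height of the building, m
C = 2.998e8          # speed of light, m/s
HBAR_EV = 6.582e-16  # reduced Planck constant, eV s
E_PHOTON_EV = 14.4e3 # 57Fe deexcitation photon energy, eV

# Assumed weak-field gravitational red shift: delta_E / E = g h / c^2.
red_shift = G * HEIGHT / C**2

# Angular frequency of the photon from E = hbar * omega.
omega = E_PHOTON_EV / HBAR_EV

print("fractional red shift:", red_shift)
print("photon angular frequency:", omega, "rad/s")
```

The shift is roughly two orders of magnitude smaller than the fractional linewidth quoted above, which illustrates why such a narrow line was needed for the measurement.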


3.10 Example: Quantum baseball 


George Gamow, in his book “Mr. Tompkins in Wonderland”, describes the strange world that would exist
if $\hbar$ was a large number. As an example, consider you play baseball in a universe where $\hbar$ is a large number.
The pitcher throws a 150 g ball 20 m to the batter at a speed of 40 m/s. For a strike to be thrown, the ball’s
position must be pitched within the 30 cm radius of the strike zone, that is, it is required that $\Delta x \le 0.3$ m.
The uncertainty relation tells us that the transverse velocity of the ball cannot be less than $\Delta v = \frac{\hbar}{2m\Delta x}$. The
time of flight of the ball from the mound to batter is $t = 0.5$ s. Because of the transverse velocity uncertainty,
$\Delta v$, the ball will deviate $t\Delta v$ transversely from the strike zone. This also must not exceed the size of the
strike zone, that is:

$$t\Delta v = \frac{\hbar t}{2m\Delta x} \le 0.3 \text{ m} \qquad \text{(Due to transverse velocity uncertainty)}$$

Combining both of these requirements gives

$$\hbar \le \frac{2m\Delta x^2}{t} = 5.4 \times 10^{-2} \text{ J·s}$$

This is 32 orders of magnitude larger than $\hbar$ so quantal effects are negligible. However, if $\hbar$ exceeded the
above value, then the pitcher would have difficulty throwing a reliable strike.
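
The numbers in this example can be reproduced directly. The following is a minimal sketch using the values given in the example and the bound $\hbar \le 2m\Delta x^2/t$; the variable names are illustrative.

```python
import math

HBAR = 1.054571817e-34  # actual reduced Planck constant, J s

mass = 0.150       # ball mass, kg
dx = 0.30          # radius of the strike zone, m
distance = 20.0    # mound to batter, m
speed = 40.0       # pitch speed, m/s

flight_time = distance / speed             # 0.5 s
hbar_max = 2 * mass * dx**2 / flight_time  # largest hbar allowing a reliable strike

# How many orders of magnitude separate this bound from the real hbar.
orders = math.log10(hbar_max / HBAR)

print("flight time:", flight_time, "s")
print("hbar bound :", hbar_max, "J s")
print("orders of magnitude above hbar:", round(orders, 1))
```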


3.12 Summary 


Linear systems have the feature that the solutions obey the Principle of Superposition, that is, the
amplitudes add linearly for the superposition of different oscillatory modes. Applicability of the Principle of
Superposition to a system provides a tremendous advantage for handling and solving the equations of motion
of oscillatory systems.

Geometric representations of the motion of dynamical systems provide sensitive probes of periodic motion.
Configuration space $(\mathbf{q}, \mathbf{q}, t)$, state space $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and phase space $(\mathbf{q}, \mathbf{p}, t)$ are powerful geometric
representations that are used extensively for recognizing periodic motion where $\mathbf{q}$, $\dot{\mathbf{q}}$, and $\mathbf{p}$ are vectors in
$n$-dimensional space.


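Linearly-damped free linear oscillator The free linearly-damped linear oscillator is characterized by
the equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = 0 \qquad (3.26)$$

The solutions of the linearly-damped free linear oscillator are of the form

$$z = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1e^{i\omega_1t} + z_2e^{-i\omega_1t}\right] \qquad \omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \qquad (3.33)$$

The solutions of the linearly-damped free linear oscillator have the following characteristic frequencies
corresponding to the three levels of linear damping

$x(t) = Ae^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1t - \beta)$, with $\omega_1$ real | underdamped
$x(t) = \left[A_1e^{-\omega_+t} + A_2e^{-\omega_-t}\right]$, with $\omega_\pm = \frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ | overdamped
$x(t) = (A + Bt)e^{-\left(\frac{\Gamma}{2}\right)t}$ | critically damped

The energy dissipation for the linearly-damped free linear oscillator time averaged over one period is
given by

$$\langle E\rangle = E_0e^{-\Gamma t} \qquad (3.44)$$

The quality factor $Q$ characterizing the damping of the free oscillator is defined to be

$$Q \equiv \frac{E}{|\Delta E|} = \frac{\omega_1}{\Gamma} \qquad (3.47)$$

where $\Delta E$ is the energy dissipated per radian.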
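
The three damping regimes and the quality factor can be illustrated with a short sketch. It assumes the relations summarized above, namely $\omega_1 = \sqrt{\omega_0^2 - (\Gamma/2)^2}$, $Q = \omega_1/\Gamma$ and $\langle E\rangle = E_0e^{-\Gamma t}$; the function names and the sample parameters are illustrative choices.

```python
import math

def classify_damping(omega0, gamma):
    """Classify a free linearly-damped oscillator by comparing Gamma/2 with omega0."""
    half = gamma / 2
    if half < omega0:
        return "underdamped"
    if half > omega0:
        return "overdamped"
    return "critically damped"

def quality_factor(omega0, gamma):
    """Q = omega1 / Gamma, defined here for the underdamped case only."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2) ** 2)
    return omega1 / gamma

def mean_energy(e0, gamma, t):
    """Time-averaged energy <E> = E0 * exp(-Gamma * t)."""
    return e0 * math.exp(-gamma * t)

# Assumed sample values: omega0 = 10 rad/s with three damping strengths.
for gamma in (1.0, 20.0, 50.0):
    print(gamma, classify_damping(10.0, gamma))

print("Q for omega0 = 10, Gamma = 1:", quality_factor(10.0, 1.0))
print("energy remaining after t = 1/Gamma:", mean_energy(1.0, 1.0, 1.0))
```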


Sinusoidally-driven, linearly-damped, linear oscillator The linearly-damped linear oscillator, driven
by a harmonic driving force, is of considerable importance to all branches of physics and engineering. The
equation of motion can be written as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = \frac{F(t)}{m} \qquad (3.49)$$

where $F(t)$ is the driving force. The complete solution of this second-order differential equation comprises
two components, the complementary solution (transient response), and the particular solution (steady-state
response). That is,

$$x(t)_{Total} = x(t)_T + x(t)_S \qquad (3.65)$$

For the underdamped case, the transient solution is the complementary solution

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1t - \delta) \qquad (3.66)$$

and the steady-state solution is given by the particular solution

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \qquad (3.67)$$


Resonance A detailed discussion of resonance and energy absorption for the driven linearly-damped linear
oscillator was given. For resonance of the linearly-damped linear oscillator the maximum amplitudes occur
at the following resonant frequencies

Resonant system | Resonant frequency
undamped free linear oscillator | $\omega_0 = \sqrt{\frac{k}{m}}$
linearly-damped free linear oscillator | $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$
driven linearly-damped linear oscillator | $\omega_R = \sqrt{\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2}$

The energy absorption for the steady-state solution for resonance is given by

$$x(t)_S = A_{el}\cos\omega t + A_{abs}\sin\omega t \qquad (3.73)$$

where the elastic amplitude

$$A_{el} = \frac{F_0}{m}\frac{(\omega_0^2 - \omega^2)}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \qquad (3.74)$$

while the absorptive amplitude

$$A_{abs} = \frac{F_0}{m}\frac{\Gamma\omega}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \qquad (3.75)$$

The time average power input is given by only the absorptive term

$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2} \qquad (3.133)$$

This power curve has the classic Lorentzian shape.

Wave propagation The wave equation was introduced and both travelling and standing wave solutions
of the wave equation were discussed. Harmonic wave-form analysis, and the complementary time-sampled 
wave form analysis techniques, were introduced in this chapter and in appendix J. The relative merits of 
Fourier analysis and the digital Green’s function waveform analysis were illustrated for signal processing. 

The concepts of phase velocity, group velocity, and signal velocity were introduced. The phase velocity
is given by

$$v_{phase} = \frac{\omega}{k} \qquad (3.117)$$

and group velocity

$$v_{group} = \left(\frac{d\omega}{dk}\right) = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \qquad (3.128)$$

If the group velocity is frequency dependent then the information content of a wave packet travels at the
signal velocity which can differ from the group velocity.

The Wave-packet Uncertainty Principle implies that making a precise measurement of the frequency of a
sinusoidal wave requires that the wave packet be infinitely long. The standard deviation $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$
characterizing the width of the amplitude of the wavepacket spectral distribution in the angular frequency
domain, $\sigma_A(\omega)$, and the corresponding width in time $\sigma_A(t)$, are related by:

$$\sigma_A(t)\cdot\sigma_A(\omega) \ge 1 \qquad \text{(Relation between amplitude uncertainties.)}$$

The standard deviations for the spectral distribution and width of the intensity of the wave packet are
related by:

$$\sigma_I(t)\cdot\sigma_I(\omega) \ge \frac{1}{2} \qquad (3.134)$$
$$\sigma_I(x)\cdot\sigma_I(k_x) \ge \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \ge \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \ge \frac{1}{2}$$


This applies to all forms of wave motion, including sound waves, water waves, electromagnetic waves, or 
matter waves.

Workshop exercises


1. Given below is a list of statements followed by a list of reasons related to harmonic motion. For each of the
statements, determine the reason(s) that make that statement true. You may do this in small groups or as one
large group; the teaching assistant will decide what works best for your workshop.

Statements:

• We can neglect the higher order terms in the Taylor expansion of $F(x)$.
• The restoring force is a linear force.
• $F_0$ must vanish.
• $(dF/dx)_0$ is negative and $k$ is positive.
• We can write $F(x)$ as a Taylor series expansion.

Reasons:

• $F(x)$ depends only on $x$.
• A position of stable equilibrium exists and we call this point the origin of our coordinate system.
• $F(x)$ has continuous derivatives of all orders.
• The restoring force is directed toward the equilibrium position.
• We consider only small displacements.

2. Second-order ordinary differential equations are an important part of the physics of the harmonic oscillator. 


(a) What do each of the following terms mean with respect to differential equations? 


i. Ordinary
ii. Second-order
iii. Homogeneous
iv. Linear
(b) Give a mini-lesson on how to solve second-order differential equations by working through the following
examples. Don’t just provide a solution; explain the steps leading up to the solution.
i. $y'' + 5y' + 6y = 0$
ii. $y'' + y' + y = 0$
iii. $y'' + 4y' + 4y = 0$

iv. $y'' - 3y = 12x$
v. $y'' - 3y' - 4y = 2\sin x$

3. Harmonic oscillations occur for many different types of systems and it is important to recognize when the
equations for harmonic motion apply. Three different systems are described below. Each system can be
approximately described using the equations for harmonic motion. Break up into three groups, one group per
system. For your group’s system, answer the following questions:


(a) What approximations are necessary for this system to exhibit harmonic oscillations? 


(b) What is the differential equation that governs the motion of this system? Use Newton’s second law to 
arrive at this equation. 


(c) What is the solution to the differential equation that you found in part (b)? 
(d) What is the natural frequency of oscillations? 


Here are the three systems: 


• A mass $m$ is tied to a massless spring having a spring constant $k$. The system oscillates in one dimension
along a horizontal frictionless surface.

• A particle of mass $m$ is attached to a weightless, extensionless rod to form a pendulum. The length of
the rod is $L$ and the system oscillates in a single plane.

• A tube is bent into the shape of a U and is partially filled with a liquid of density $\rho$. The cross-sectional
area of the tube is $A$ and the length of the tube filled with liquid is $L$. The liquid is initially displaced so
that it is higher on one side of the tube than the other.


Once each group has answered all of the questions, share the results with the entire class. 


4. Consider a mass m attached to a spring of spring constant k. The spring is mounted horizontally so that the 
mass oscillates horizontally on a frictionless surface. The spring is attached to the wall on the right and the 
mass is initially moved to the right of its equilibrium position (compressing the spring) by a distance s and 
released. Working individually, determine how (if at all) the period of the motion would be affected by each of 
the changes below. Once you have answered each part on your own, compare your answers with a classmate. 


The spring is replaced with a stiffer spring. 


The mass is initially displaced a distance s to the left and released. 


5. When you were first introduced to simple harmonic motion, you used the formula $m\ddot{x} = -kx$ to find the
position of the oscillating mass as a function of time. This assumes that the origin is defined to be the 
equilibrium point. What happens if this is not the case? What would the equation of motion look like? How 
would the position of the oscillating mass as a function of time change? 


6. For each of the situations described below, give a rough sketch of the state space diagram ($\dot{x}$ versus $x$) that
represents the motion of each object. All of the motion takes place along the $x$-axis.

(a) An eggplant is at rest at a point on the $+x$ axis.
(b) A monkey on a skateboard skates with constant speed in the negative $x$ direction.
(c) A race car moving in the $+x$ direction undergoes constant acceleration until it abruptly stops.
(d) A cantaloupe undergoes simple harmonic motion. The initial location of the cantaloupe is at a point on
the $+x$ axis.


7. Consider a simple harmonic oscillator consisting of a mass $m$ attached to a spring of spring constant $k$. For
this oscillator $x(t) = A\sin(\omega_0t - \delta)$.

(a) Find an expression for $\dot{x}(t)$.
(b) Eliminate $t$ between $x(t)$ and $\dot{x}(t)$ to arrive at one equation similar to that for an ellipse.
(c) Rewrite the equation in part (b) in terms of $x$, $\dot{x}$, $k$, $m$, and the total energy $E$.
(d) Give a rough sketch of the phase space diagram ($\dot{x}$ versus $x$) for this oscillator. Also, on the same set of
axes, sketch the phase space diagram for a similar oscillator with a total energy that is larger than the
first oscillator.
(e) What direction are the paths that you have sketched? Explain your answer.
(f) Would different trajectories for the same oscillator ever cross paths? Why or why not?

8. Consider a damped, driven oscillator consisting of a mass m attached to a spring of spring constant k. 


(a) What is the equation of motion for this system? 


(b) Solve the equation in part (a). The solution consists of two parts, the complementary solution and the 
particular solution. When might it be possible to safely neglect one part of the solution? 


(c) What is the difference between amplitude resonance and kinetic energy resonance? 


(d) How might phase space diagrams look for this type of oscillator? What variables would affect the diagram?

9. A particle of mass $m$ is subject to the following force

$$\mathbf{F} = A(x^3 - 4x^2 + 3x)\hat{\mathbf{x}}$$

where $A$ is a constant.

(a) Determine the points where the particle is in equilibrium.
(b) Which of these points are stable and which are unstable?


(c) Is the motion bounded or unbounded? 


10. A very long cylindrical shell has a mass density that depends upon the radial distance such that $\rho(r) = \frac{k}{r}$,
where $k$ is a constant. The inner radius of the shell is $a$ and the outer radius is $b$.


(a) Determine the direction and the magnitude of the gravitational field for all regions of space. 


(b) If the gravitational potential is zero at the origin, what is the difference between the gravitational potential 
at r = b and r = a? 


11. A mass m is constrained to move along one dimension. Two identical springs are attached to the mass, one on 
each side, and each spring is in turn attached to a wall. Both springs have the same spring constant k. 


(a) Determine the frequency of the oscillation, assuming no damping. 


(b) Now consider damping. It is observed that after n oscillations, the amplitude of the oscillation has 
dropped to one-half of its initial value. Find an expression for the damping constant. 


(c) How long does it take for the amplitude to decrease to one-quarter of its initial value? 


12. Discuss the motion of a continuous string when plucked at one third of the length of the string. That is, the
initial condition is $\dot{q}(x, 0) = 0$, and

$$q(x, 0) = \begin{cases} \frac{3A}{L}x & 0 \le x \le \frac{L}{3} \\ \frac{3A}{2L}(L - x) & \frac{L}{3} \le x \le L \end{cases}$$

13. When a particular driving force is applied to a stretched string it is observed that the string vibration is purely
of the $n^{th}$ harmonic. Find the driving force.


14. Consider the two-mass system pivoted at its vertex where $M \ne m$. It undergoes oscillations of the angle $\theta$
with respect to the vertical in the plane of the triangle.


(a) Determine the angular frequency of small oscillations. 

(b) Use your result from part (a) to show w? = 2 for M >m. 
(c) Show that your result from part (a) agrees with $\omega^2 = \frac{U''(\theta_e)}{I}$ where $\theta_e$ is the equilibrium angle and $I$ is
the moment of inertia.
(d) Assume the system has energy E. Setup an integral that determines the period of oscillation. 


15. A cube of side $a$ and mass $m$ is immersed in water with density $\rho$ past the point of equilibrium and then
released. Assume there is no damping due to the water.


(a) Show that the cube’s equation of motion is

$$\frac{d^2x}{dt^2} + Ax = B$$

where $A$ and $B$ are constants. Determine $A$ and $B$.

(b) The solution to the equation of motion is

$$x(t) = \frac{B}{A} + C_1\cos(\sqrt{A}\,t) + C_2\sin(\sqrt{A}\,t)$$

where $C_1$ and $C_2$ are constants. If $x(0) = -a$, determine $x(t)$.
(c) Determine the period $T$ of oscillation.

Problems

1. An unusual pendulum is made by fixing a string to a horizontal cylinder of radius R, wrapping the string 
several times around the cylinder, and then tying a mass m to the loose end. In equilibrium the mass hangs a 
distance $l_0$ vertically below the edge of the cylinder. Find the potential energy if the pendulum has swung to
an angle $\phi$ from the vertical. Show that for small angles, it can be written in the Hooke’s Law form $U = \frac{1}{2}k\phi^2$.
Comment on the value of $k$.

2. Consider the two-dimensional anisotropic oscillator with $\omega_x = p\omega$ and $\omega_y = q\omega$.

a) Prove that if the ratio of the frequencies is rational (that is, $\frac{\omega_x}{\omega_y} = \frac{p}{q}$ where $p$ and $q$ are integers) then the
motion is periodic. What is the period?

b) Prove that if the same ratio is irrational, the motion never repeats itself.

3. A simple pendulum consists of a mass $m$ suspended from a fixed point by a weightless, extensionless rod of
length $l$.

a) Obtain the equation of motion, and in the approximation $\sin\theta \approx \theta$, show that the natural frequency is
$\omega_0 = \sqrt{\frac{g}{l}}$, where $g$ is the gravitational field strength.

b) Discuss the motion in the event that the motion takes place in a viscous medium with retarding force
$2m\sqrt{gl}\,\dot{\theta}$.

4. Derive the expression for the State Space paths of the plane pendulum if the total energy is $E > 2mgl$. Note
that this is just the case of a particle moving in a periodic potential $U(\theta) = mgl(1 - \cos\theta)$. Sketch the State
Space diagram for both $E > 2mgl$ and $E < 2mgl$.

5. Consider the motion of a driven linearly-damped harmonic oscillator after the transient solution has died out,
and suppose that it is being driven close to resonance, $\omega = \omega_0$.

a) Show that the oscillator’s total energy is $E = \frac{1}{2}m\omega^2A^2$.
b) Show that the energy $\Delta E_{dis}$ dissipated during one cycle by the damping force $\Gamma\dot{x}$ is $\pi\Gamma m\omega A^2$.

6. Two masses $m_1$ and $m_2$ slide freely on a horizontal frictionless rail and are connected by a spring whose force
constant is k. Find the frequency of oscillatory motion for this system. 

7. A particle of mass $m$ moves under the influence of a resistive force proportional to velocity and a potential $U$,
that is

$$F(x,\dot{x}) = -b\dot{x} - \frac{\partial U}{\partial x}$$

where $b > 0$ and $U(x) = (x^2 - a^2)^2$.

a) Find the points of stable and unstable equilibrium.
b) Find the solution of the equations of motion for small oscillations around the stable equilibrium points.

c) Show that as $t \to \infty$ the particle approaches one of the stable equilibrium points for most choices of initial
conditions. What are the exceptions? (Hint: You can prove this without finding the solutions explicitly.)


Chapter 4 


Nonlinear systems and chaos 


4.1 Introduction 


In nature only a subset of systems have equations of motion that are linear. Contrary to the impression 
given by the analytic solutions presented in undergraduate physics courses, most dynamical systems in 
nature exhibit non-linear behavior that leads to complicated motion. Non-linear equations of motion
usually do not have analytic solutions, superposition does not apply, and they predict phenomena such as
attractors, discontinuous period bifurcation, extreme sensitivity to initial conditions, rolling motion, and
chaos. During the past four decades, exciting discoveries have been made in classical mechanics that are
associated with the recognition that nonlinear systems can exhibit chaos. Chaotic phenomena have been
observed in most fields of science and engineering, such as weather patterns, fluid flow, motion of planets in
the solar system, epidemics, changing populations of animals, birds and insects, and the motion of electrons
in atoms. The complicated dynamical behavior predicted by non-linear differential equations is not limited 
to classical mechanics, rather it is a manifestation of the mathematical properties of the solutions of the 
differential equations involved, and thus is generally applicable to solutions of first or second-order non- 
linear differential equations. It is important to understand that the systems discussed in this chapter follow 
a fully deterministic evolution predicted by the laws of classical mechanics, the evolution for which is based 
on the prior history. This behavior is completely different from a random walk where each step is based on a 
random process. The complicated motion of deterministic non-linear systems stems in part from sensitivity 
to the initial conditions. 

The French mathematician Poincaré is credited with being the first to recognize the existence of chaos 
during his investigation of the gravitational three-body problem in celestial mechanics. At the end of the 
nineteenth century Poincaré noticed that such systems exhibit high sensitivity to initial conditions characteristic
of chaotic motion, and the existence of nonlinearity which is required to produce chaos. Poincaré's work
received little notice, in part because it was overshadowed by the parallel development of the Theory of Relativity
and quantum mechanics at the start of the 20th century. In addition, solving nonlinear equations of motion
is difficult, which discouraged work on nonlinear mechanics and chaotic motion. The field blossomed during
the 1960's when computers became sufficiently powerful to solve the nonlinear equations required to calculate 
the long-time histories necessary to document the evolution of chaotic behavior. Laplace, and many other 
scientists, believed in the deterministic view of nature which assumes that if the position and velocities of 
all particles are known, then one can unambiguously predict the future motion using Newtonian mechanics. 
Researchers in many fields of science now realize that this “clockwork universe” is invalid. That is, knowing 
the laws of nature can be insufficient to predict the evolution of nonlinear systems in that the time evolu- 
tion can be extremely sensitive to the initial conditions even though they follow a completely deterministic 
development. There are two major classifications of nonlinear systems that lead to chaos in nature. The 
first classification encompasses nondissipative Hamiltonian systems such as Poincaré's three-body celestial 
mechanics system. The other main classification involves driven, damped, non-linear oscillatory systems. 

Nonlinearity and chaos is a broad and active field and thus this chapter will focus only on a few examples 
that illustrate the general features of non-linear systems. Weak non-linearity is used to illustrate bifurcation 
and asymptotic attractor solutions for which the system evolves independent of the initial conditions. The 
common sinusoidally-driven linearly-damped plane pendulum illustrates several features characteristic of the 
evolution of a non-linear system from order to chaos. The impact of non-linearity on wavepacket propagation
velocities and the existence of soliton solutions is discussed. The example of the three-body problem is 
discussed in chapter 11. The transition from laminar flow to turbulent flow is illustrated by fluid mechanics 
discussed in chapter 16.8. Analytic solutions of nonlinear systems usually are not available and thus one 
must resort to computer simulations. As a consequence the present discussion focusses on the main features 
of the solutions for these systems and ignores how the equations of motion are solved. 


4.2 Weak nonlinearity 


Most physical oscillators become non-linear with increase in amplitude of the oscillations. Consequences 
of non-linearity include breakdown of superposition, introduction of additional harmonics, and complicated 
chaotic motion that has great sensitivity to the initial conditions as illustrated in this chapter. Weak non- 
linearity is interesting since perturbation theory can be used to solve the non-linear equations of motion. 

The potential energy function for a linear oscillator has a pure parabolic shape about the minimum
location, that is, $U = \frac{1}{2}k(x - x_0)^2$ where $x_0$ is the location of the minimum. Weak non-linear systems have
small amplitude oscillations $\Delta x$ about the minimum allowing use of the Taylor expansion

$$U(\Delta x) = U(x_0) + \Delta x\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \cdots \tag{4.1}$$

By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$, and thus equation 4.1 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \cdots \tag{4.2}$$

For small amplitude oscillations the system is linear when only the second-order $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ term in equation
4.2 is significant. The linearity for small amplitude oscillations greatly simplifies description of the oscillatory 
motion in that superposition applies, and complicated chaotic motion is avoided. For slightly larger amplitude 
motion, where the higher-order terms in the expansion are still much smaller than the second-order term, 
then perturbation theory can be used as illustrated by the simple plane pendulum which is non linear since 
the restoring force equals 

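$$mg\sin\theta = mg\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots\right) \tag{4.3}$$

This is linear only at very small angles where the higher-order terms in the expansion can be neglected.
Consider the equation of motion at small amplitudes for the harmonically-driven, linearly-damped plane
pendulum

$$\ddot{\theta} + \Gamma\dot{\theta} + \omega_0^2\sin\theta = \ddot{\theta} + \Gamma\dot{\theta} + \omega_0^2\left(\theta - \frac{\theta^3}{6}\right) = F_0\cos(\omega t) \tag{4.4}$$

where only the first two terms in the expansion 4.3 have been included. It was shown in chapter 3 that when
$\sin\theta \approx \theta$ then the steady-state solution of equation 4.4 is of the form

$$\theta(t) = A\cos(\omega t - \delta) \tag{4.5}$$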


Insert this first-order solution into equation 4.4, then the cubic term in the expansion gives a term
$\cos^3\omega t = \frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$. Thus the perturbation expansion to third order involves a solution of the form

$$\theta(t) = A\cos(\omega t - \delta) + B\cos 3(\omega t - \delta) \tag{4.6}$$

This perturbation solution shows that the non-linear term has distorted the signal by addition of the third
harmonic of the driving frequency with an amplitude that depends sensitively on $\theta$. This illustrates that the
superposition principle is not obeyed for this non-linear system, but, if the non-linearity is weak, perturbation 
theory can be used to derive the solution of a non-linear equation of motion. 
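
The appearance of the third harmonic can be checked numerically. The following is a minimal sketch, not part of the original text, that projects the cube of a unit-amplitude first-order solution onto the harmonics $\cos n\phi$ with $\phi = \omega t - \delta$. The function names and the number of sample points are illustrative assumptions.

```python
import math

def cosine_coefficient(f, n, samples=720):
    """Return a_n = (1/pi) * integral over one cycle of f(phi) * cos(n * phi)."""
    dphi = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = k * dphi
        total += f(phi) * math.cos(n * phi)
    return total * dphi / math.pi

def cubic_term(phi):
    """Cube of a unit-amplitude first-order solution cos(phi)."""
    return math.cos(phi) ** 3

if __name__ == "__main__":
    # Print the weight of each harmonic contained in cos^3(phi).
    for n in range(1, 6):
        print(f"harmonic {n}: {cosine_coefficient(cubic_term, n):+.6f}")
```

Only the $n = 1$ and $n = 3$ coefficients are non-zero, with weights 3/4 and 1/4, in agreement with the identity $\cos^3\omega t = \frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$.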

Figure 4.1 illustrates that for a potential $U(x) = 2x^2 + x^4$, the $x^4$ non-linear term is greatest at the
maximum amplitude $x$, which makes the total energy contours in state-space more rectangular than the
elliptical shape for the harmonic oscillator as shown in figure 3.3. The solution is of the form given in
equation 4.6.

Figure 4.1: The left side shows the potential energy for a symmetric potential $U(x) = 2x^2 + x^4$. The right
side shows the contours of constant total energy on a state-space diagram.


4.1 Example: Non-linear oscillator 


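Assume that a non-linear oscillator has a potential given by

$$U(x) = \frac{kx^2}{2} - \frac{m\lambda x^3}{3}$$

where $\lambda$ is small. Find the solution of the equation of motion to first order in $\lambda$, assuming $x = 0$ at $t = 0$.
The equation of motion for the nonlinear oscillator is

$$m\ddot{x} = -\frac{dU}{dx} = -kx + m\lambda x^2$$

If the $m\lambda x^2$ term is neglected, then the second-order equation of motion reduces to a normal linear oscillator
with

$$x_0 = A\sin(\omega_0 t + \delta)$$

where

$$\omega_0 = \sqrt{\frac{k}{m}}$$

Assume that the first-order solution has the form

$$x = x_0 + \lambda x_1$$

Substituting this into the equation of motion, taking $\delta = 0$ so that $x_0 = 0$ at $t = 0$, and neglecting terms of
higher order than $\lambda$, gives

$$\ddot{x}_1 + \omega_0^2 x_1 = x_0^2 = \frac{A^2}{2}\left[1 - \cos(2\omega_0 t)\right]$$

To solve this try a particular integral

$$x_1 = B + C\cos(2\omega_0 t)$$

Substituting this into the equation of motion gives

$$-3\omega_0^2 C\cos(2\omega_0 t) + \omega_0^2 B = \frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_0 t)$$

Comparison of the coefficients gives

$$B = \frac{A^2}{2\omega_0^2} \qquad C = \frac{A^2}{6\omega_0^2}$$

The homogeneous equation is

$$\ddot{x}_1 + \omega_0^2 x_1 = 0$$

which has a solution of the form

$$x_1 = D_1\sin(\omega_0 t) + D_2\cos(\omega_0 t)$$

Thus combining the particular and homogeneous solutions gives

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \lambda\left[\frac{A^2}{2\omega_0^2} + D_2\cos(\omega_0 t) + \frac{A^2}{6\omega_0^2}\cos(2\omega_0 t)\right]$$

The initial condition $x = 0$ at $t = 0$ then gives

$$D_2 = -\frac{2A^2}{3\omega_0^2}$$

and

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \frac{\lambda A^2}{\omega_0^2}\left[\frac{1}{2} - \frac{2}{3}\cos(\omega_0 t) + \frac{1}{6}\cos(2\omega_0 t)\right]$$

The constant $(A + \lambda D_1)$ is given by the initial amplitude and velocity.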

This system is nonlinear in that the output amplitude is not proportional to the input amplitude. Secondly, 
a large amplitude second harmonic component is introduced in the output waveform; that is, for a non-linear 
system the gain and frequency composition of the output differ from those of the input. Note that the frequency
composition is amplitude dependent. This particular example of a nonlinear system does not exhibit chaos. 
The Laboratory for Laser Energetics uses nonlinear crystals to double the frequency of laser light. 
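
The first-order result of this example can be compared with a direct numerical integration. The sketch below is an illustration under stated assumptions rather than the method used in the text: it takes the equation of motion in the form $\ddot{x} + \omega_0^2 x = \lambda x^2$, chooses $D_1 = 0$ so that $x(0) = 0$ and $\dot{x}(0) = A\omega_0$, and uses a hand-written fourth-order Runge-Kutta step. The parameter values ($A = 1$, $\lambda = 0.02$, $\omega_0 = 1$) are arbitrary.

```python
import math

def rk4_step(rhs, t, state, dt):
    """One fourth-order Runge-Kutta step for d(state)/dt = rhs(t, state)."""
    k1 = rhs(t, state)
    k2 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def first_order_solution(t, A, lam, w0):
    """Perturbation solution to first order in lambda, with D1 = 0."""
    bracket = 0.5 - (2.0 / 3.0) * math.cos(w0 * t) + (1.0 / 6.0) * math.cos(2 * w0 * t)
    return A * math.sin(w0 * t) + lam * A ** 2 / w0 ** 2 * bracket

def max_errors(A=1.0, lam=0.02, w0=1.0, t_end=10.0, dt=0.001):
    """Largest deviation of the perturbative and the linear solutions from RK4."""
    def rhs(t, state):
        x, v = state
        return [v, -w0 ** 2 * x + lam * x ** 2]
    state, t = [0.0, A * w0], 0.0
    err_perturbative = err_linear = 0.0
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(rhs, t, state, dt)
        t += dt
        err_perturbative = max(err_perturbative,
                               abs(state[0] - first_order_solution(t, A, lam, w0)))
        err_linear = max(err_linear, abs(state[0] - A * math.sin(w0 * t)))
    return err_perturbative, err_linear

if __name__ == "__main__":
    print(max_errors())
```

For small $\lambda$ the first-order solution tracks the numerical solution much more closely than the linear solution $A\sin(\omega_0 t)$ does. The residual error is of order $\lambda^2$ and grows slowly with time.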


4.3 Bifurcation, and point attractors 


Interesting new phenomena, such as bifurcation, and attractors, occur when the non-linearity is large. In 
chapter 3 it was shown that the state-space diagram $(x,\dot{x})$ for an undamped harmonic oscillator is an
ellipse with dimensions defined by the total energy of the system. As shown in figure 3.5, for the damped 
harmonic oscillator, the state-space diagram spirals inwards to the origin due to dissipation of energy. Non- 
linearity distorts the shape of the ellipse or spiral on the state-space diagram, and thus the state-space, or 
corresponding phase-space, diagrams, provide useful representations of the motion of linear and non-linear 
periodic systems. 

The complicated motion of non-linear systems makes it necessary to distinguish between transient and 
asymptotic behavior. The damped harmonic oscillator executes a transient spiral motion that asymptotically 
approaches the origin. The transient behavior depends on the initial conditions, whereas the asymptotic limit 
of the steady-state solution is a specific location, that is called a point attractor. The point attractor for 
damped motion in the anharmonic potential well 


$$U(x) = 2x^2 + x^4 \tag{4.7}$$


is at the minimum, which is the origin of the state-space diagram as shown in figure 4.1. 
The more complicated one-dimensional potential well

$$U(x) = 8 - 4x^2 + 0.5x^4 \tag{4.8}$$

shown in figure 4.2, has two minima that are symmetric about $x = 0$ with a saddle of height 8.
The kinetic plus potential energies of a particle with mass $m = 2$, released in this potential, will be
assumed to be given by

$$E(x,\dot{x}) = \dot{x}^2 + U(x) \tag{4.9}$$

The state-space plot in figure 4.2 shows contours of constant energy with the minima at $(x,\dot{x}) = (\pm 2, 0)$.
At slightly higher total energy the contours are closed loops around either of the two minima at $x = \pm 2$.
At total energies above the saddle energy of 8 the contours are peanut-shaped and are symmetric about
the origin. Assuming that the motion is weakly damped, then a particle released with total energy $E_{total}$
which is higher than $E_{saddle}$ will follow a peanut-shaped spiral trajectory centered at $(x,\dot{x}) = (0,0)$ in the
state-space diagram for $E_{total} > E_{saddle}$. For $E_{total} < E_{saddle}$ there are two separate solutions for the two
minima centered at $x = \pm 2$ and $\dot{x} = 0$. This is an example of bifurcation where the one solution for
$E_{total} > E_{saddle}$ bifurcates into either of the two solutions for $E_{total} < E_{saddle}$.

Figure 4.2: The left side shows the potential energy for a bimodal symmetric potential $U(x) = 8 - 4x^2 + 0.5x^4$.
The right-hand figure shows contours of the sum of kinetic and potential energies on a state-space
diagram. For total energies above the saddle point the particle follows peanut-shaped trajectories in
state-space centered around $(x,\dot{x}) = (0,0)$. For total energies below the saddle point the particle will have closed
trajectories about either of the two symmetric minima located at $(x,\dot{x}) = (\pm 2, 0)$. Thus the system solution
bifurcates when the total energy is below the saddle point.

For an initial total energy $E_{total} > E_{saddle}$, damping will result in spiral trajectories of the particle that
will be trapped in one of the two minima. For $E_{total} > E_{saddle}$ the particle trajectories are centered giving
the impression that they will terminate at $(x,\dot{x}) = (0,0)$ when the kinetic energy is dissipated. However, for
$E_{total} < E_{saddle}$ the particle will be trapped in one of the two minima and the trajectory will terminate
at the bottom of that potential energy minimum occurring at $(x,\dot{x}) = (\pm 2, 0)$. These two possible terminal
points of the trajectory are called point attractors. This example appears to have a single attractor for
$E_{total} > E_{saddle}$ which bifurcates leading to two attractors at $(x,\dot{x}) = (\pm 2, 0)$ for $E_{total} < E_{saddle}$. The
determination as to which minimum traps a given particle depends on exactly where the particle starts in
state space, the damping, etc. That is, for this case, where there is symmetry about the $x$-axis, if the
particle has an initial total energy $E_{total} > E_{saddle}$, then the initial conditions within $\pi$ radians of state space
will lead to trajectories that are trapped in the left minimum, and the other $\pi$ radians of state space will be
trapped in the right minimum. Trajectories starting near the split between these two halves of the starting
state space will be sensitive to the exact starting phase. This is an example of sensitivity to initial conditions.
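
The trapping of a damped particle in one of the two point attractors can be illustrated numerically. This is a minimal sketch under stated assumptions: the linear damping force $-b\dot{x}$ with $b = 0.5$ is a hypothetical choice that the text does not specify, the mass is $m = 2$ as in equation 4.9, and the integrator is a hand-written fourth-order Runge-Kutta step.

```python
def rk4_step(rhs, state, dt):
    """One fourth-order Runge-Kutta step for an autonomous system."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def final_state(x0, v0, damping=0.5, mass=2.0, t_end=150.0, dt=0.005):
    """Integrate m*x'' = -dU/dx - b*x' for U = 8 - 4x^2 + 0.5x^4; return (x, v)."""
    def rhs(state):
        x, v = state
        force = 8.0 * x - 2.0 * x ** 3  # -dU/dx for the double-well potential
        return [v, (force - damping * v) / mass]
    state = [x0, v0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(rhs, state, dt)
    return state

if __name__ == "__main__":
    # All three starts have total energy above the saddle energy of 8.
    for start in [(0.0, 3.5), (0.0, -3.5), (0.5, 4.0)]:
        x, v = final_state(*start)
        print(start, "->", round(x, 4), round(v, 4))
```

Each trajectory ends at one of the attractors $(x,\dot{x}) = (\pm 2, 0)$. Reversing the signs of both initial values sends the particle to the mirror-image attractor, reflecting the symmetry of the potential.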


4.4 Limit cycles 


4.4.1 Poincaré-Bendixson theorem 


Coupled first-order differential equations in two dimensions of the form

$$\dot{x} = f(x,y) \qquad \dot{y} = g(x,y) \tag{4.10}$$

occur frequently in physics. The state-space paths do not cross for such two-dimensional autonomous systems,
where an autonomous system is not explicitly dependent on time.

The Poincaré-Bendixson theorem states that paths in state-space, and phase-space, have three possible behaviors:

(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,

(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped harmonic oscillator,

(3) paths that tend to a limit cycle as $t \to \infty$.

The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor 
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces 
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of 
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more 
complicated than for point attractors. 

Figure 4.3: The Poincaré-Bendixson theorem allows the following three scenarios for two-dimensional autonomous
systems. (1) Closed paths as illustrated by the undamped harmonic oscillator. (2) Terminate at
an equilibrium point as $t \to \infty$, as illustrated by the damped harmonic oscillator, and (3) Tend to a limit
cycle as $t \to \infty$ as illustrated by the van der Pol oscillator.


4.4.2 van der Pol damped harmonic oscillator: 


The van der Pol damped harmonic oscillator illustrates a non-linear equation that leads to a well-studied,
limit-cycle attractor that has important applications in diverse fields. The van der Pol oscillator has an
equation of motion given by

$$\frac{d^2x}{dt^2} + \mu(x^2 - 1)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$

The non-linear $\mu(x^2 - 1)\frac{dx}{dt}$ damping term is unusual in that the sign changes when $|x| = 1$ leading to
positive damping for $|x| > 1$ and negative damping for $|x| < 1$. To simplify equation 4.11, assume that the term
$\omega_0^2 x = x$, that is, $\omega_0^2 = 1$.

This equation was studied extensively during the 1920’s and 1930’s by the Dutch engineer, Balthazar
van der Pol, for describing electronic circuits that incorporate feedback. The form of the solution can be
simplified by defining a variable $y \equiv \frac{dx}{dt}$. Then the second-order equation 4.11 can be expressed as two
coupled first-order equations.

$$y \equiv \frac{dx}{dt} \tag{4.12}$$

$$\frac{dy}{dt} = -x - \mu(x^2 - 1)y \tag{4.13}$$
It is advantageous to transform the $(x,\dot{x})$ state space to polar coordinates by setting

$$x = r\cos\theta \qquad y = r\sin\theta \tag{4.14}$$

and using the fact that $r^2 = x^2 + y^2$. Therefore

$$r\frac{dr}{dt} = x\frac{dx}{dt} + y\frac{dy}{dt} \tag{4.15}$$

Similarly for the angle coordinate

$$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\frac{d\theta}{dt}\sin\theta \tag{4.16}$$

$$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\frac{d\theta}{dt}\cos\theta \tag{4.17}$$

Multiplying equation 4.16 by $y$ and 4.17 by $x$ and subtracting gives

$$r^2\frac{d\theta}{dt} = x\frac{dy}{dt} - y\frac{dx}{dt} \tag{4.18}$$

Figure 4.4: Solutions of the van der Pol system for $\mu = 0.2$ top row and $\mu = 5$ bottom row, assuming that
$\omega_0^2 = 1$. The left column shows the time dependence $x(t)$. The right column shows the corresponding $(x,\dot{x})$
state space plots. Upper: Weak nonlinearity, $\mu = 0.2$; At large times the solution tends to one limit
cycle for initial values inside or outside the limit cycle attractor. The amplitude $x(t)$ for two initial conditions
approaches an approximately harmonic oscillation. Lower: Strong nonlinearity, $\mu = 5$; Solutions
approach a common limit cycle attractor for initial values inside or outside the limit cycle attractor while
the amplitude $x(t)$ approaches a common approximate square-wave oscillation.


Equations 4.15 and 4.18 allow the van der Pol equations of motion to be written in polar coordinates

$$\frac{dr}{dt} = -\mu\left(r^2\cos^2\theta - 1\right)r\sin^2\theta \tag{4.19}$$

$$\frac{d\theta}{dt} = -1 - \mu\left(r^2\cos^2\theta - 1\right)\sin\theta\cos\theta \tag{4.20}$$

The non-linear terms on the right-hand side of equations 4.19 and 4.20 have a complicated form.
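
The polar form can be checked pointwise against the Cartesian equations. The sketch below is an illustrative consistency check with hypothetical function names, assuming $\omega_0^2 = 1$: it evaluates $\frac{dr}{dt}$ and $\frac{d\theta}{dt}$ from equations 4.12, 4.13, 4.15 and 4.18 and compares them with equations 4.19 and 4.20.

```python
import math

def cartesian_rates(x, y, mu):
    """Equations 4.12 and 4.13 with omega_0^2 = 1: returns (dx/dt, dy/dt)."""
    return y, -x - mu * (x * x - 1.0) * y

def polar_rates(r, theta, mu):
    """Equations 4.19 and 4.20: returns (dr/dt, dtheta/dt)."""
    bracket = r * r * math.cos(theta) ** 2 - 1.0
    r_dot = -mu * bracket * r * math.sin(theta) ** 2
    theta_dot = -1.0 - mu * bracket * math.sin(theta) * math.cos(theta)
    return r_dot, theta_dot

def mismatch(r, theta, mu):
    """Absolute differences between the two routes to dr/dt and dtheta/dt."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    x_dot, y_dot = cartesian_rates(x, y, mu)
    r_dot = (x * x_dot + y * y_dot) / r              # equation 4.15
    theta_dot = (x * y_dot - y * x_dot) / (r * r)    # equation 4.18
    polar_r_dot, polar_theta_dot = polar_rates(r, theta, mu)
    return abs(r_dot - polar_r_dot), abs(theta_dot - polar_theta_dot)

if __name__ == "__main__":
    for point in [(0.5, 0.3, 0.2), (2.0, 1.7, 0.2), (3.0, 4.0, 5.0)]:
        print(point, mismatch(*point))
```

The differences are at the level of floating-point rounding for any choice of $r > 0$, $\theta$ and $\mu$.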


Weak non-linearity: $\mu \ll 1$


In the limit that $\mu \to 0$, equations 4.19, 4.20 correspond to a circular state-space trajectory similar to the
harmonic oscillator. That is, the solution is of the form

$$x(t) = \rho\sin(t - t_0) \tag{4.21}$$

where $\rho$ and $t_0$ are arbitrary parameters. For weak non-linearity, $\mu \ll 1$, the angular equation 4.20 has a
rotational frequency that is unity since the $\sin\theta\cos\theta$ term changes sign twice per period, in addition to the
small value of $\mu$. For $\mu \ll 1$ and $r < 1$, the bracket $(r^2\cos^2\theta - 1)$ in the radial equation 4.19 is
negative, so $\frac{dr}{dt}$ is positive and the radius increases monotonically. For larger $r$ the bracket changes sign
during each cycle. Averaging equation 4.19 over one cycle at fixed $r$ gives $\left\langle\frac{dr}{dt}\right\rangle = \frac{\mu r}{8}(4 - r^2)$, which is
positive for $r < 2$ and negative for $r > 2$, resulting in a spiral decrease in the radius from larger values.
Thus, for very weak non-linearity, this radial behavior
results in the amplitude spiralling to a well defined limit-cycle attractor value of $\rho = 2$ as illustrated by
the state-space plots in figure 4.4 for cases where the initial condition is inside or external to the circular
attractor. The final amplitude for different initial conditions also approaches the same asymptotic behavior.
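
The approach to the limit cycle from inside and from outside can be reproduced with a short integration. This is a minimal sketch rather than the calculation behind figure 4.4: it assumes $\omega_0^2 = 1$ and $\mu = 0.2$, uses a hand-written fourth-order Runge-Kutta step, and estimates the final amplitude as the largest $|x|$ over the last ten time units. The step size and run time are arbitrary choices.

```python
def rk4_step(rhs, state, dt):
    """One fourth-order Runge-Kutta step for an autonomous system."""
    k1 = rhs(state)
    k2 = rhs([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def limit_cycle_amplitude(x0, y0=0.0, mu=0.2, t_end=150.0, dt=0.005, window=10.0):
    """Largest |x| during the final `window` time units of a van der Pol run."""
    def rhs(state):
        x, y = state
        return [y, -x - mu * (x * x - 1.0) * y]  # equations 4.12 and 4.13
    state = [x0, y0]
    steps = int(round(t_end / dt))
    first_recorded = steps - int(round(window / dt))
    amplitude = 0.0
    for step in range(steps):
        state = rk4_step(rhs, state, dt)
        if step >= first_recorded:
            amplitude = max(amplitude, abs(state[0]))
    return amplitude

if __name__ == "__main__":
    print("start inside :", limit_cycle_amplitude(0.1))
    print("start outside:", limit_cycle_amplitude(4.0))
```

Both starting points settle to an oscillation with amplitude close to 2, independent of whether the initial condition lies inside or outside the attractor.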


Dominant non-linearity: $\mu \gg 1$


For the case where the non-linearity is dominant, that is $\mu \gg 1$, then as shown in figure 4.4, the system
approaches a well defined attractor, but in this case it has a significantly skewed shape in state-space, while
the amplitude approximates a square wave. The solution remains close to $x = +2$ until $y = \dot{x} = +7$ and
then it relaxes quickly to $x = -2$ with $y = \dot{x} \approx 0$. This is followed by the mirror image. This behavior is
called a relaxed vibration in that a tension builds up slowly then dissipates by a sudden relaxation process.
The seesaw is an extreme example of a relaxation oscillator where the seesaw angle switches spontaneously
from one solution to the other when the difference in their moment arms changes sign.

The study of feedback in electronic circuits was the stimulus for study of this equation by van der 
Pol. However, Lord Rayleigh first identified such relaxation oscillator behavior in 1880 during studies of 
vibrations of a stringed instrument excited by a bow, or the squeaking of a brake drum. In his discussion of 
non-linear effects in acoustics, he derived the equation 


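$$\ddot{x} - (a - b\dot{x}^2)\dot{x} + \omega_0^2 x = 0 \tag{4.22}$$

Differentiation of Rayleigh’s equation 4.22 gives

$$\dddot{x} - (a - 3b\dot{x}^2)\ddot{x} + \omega_0^2\dot{x} = 0 \tag{4.23}$$

Using the substitution of

$$y = y_0\sqrt{\frac{3b}{a}}\,\dot{x} \tag{4.24}$$

leads to the relations

$$\dot{x} = \sqrt{\frac{a}{3b}}\frac{y}{y_0} \qquad \ddot{x} = \sqrt{\frac{a}{3b}}\frac{\dot{y}}{y_0} \qquad \dddot{x} = \sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} \tag{4.25}$$

Substituting these relations into equation 4.23 gives

$$\sqrt{\frac{a}{3b}}\frac{\ddot{y}}{y_0} - \left[a - \frac{3ba}{3b}\frac{y^2}{y_0^2}\right]\sqrt{\frac{a}{3b}}\frac{\dot{y}}{y_0} + \omega_0^2\sqrt{\frac{a}{3b}}\frac{y}{y_0} = 0 \tag{4.26}$$

Multiplying by $y_0\sqrt{\frac{3b}{a}}$ and rearranging leads to the van der Pol equation

$$\ddot{y} - \frac{a}{y_0^2}\left(y_0^2 - y^2\right)\dot{y} + \omega_0^2 y = 0 \tag{4.27}$$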
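
The substitution can be verified numerically without solving either equation. The sketch below is an illustrative check with a hypothetical function name: at an arbitrary point $(x,\dot{x})$ it obtains $\ddot{x}$ from Rayleigh's equation 4.22 and $\dddot{x}$ from equation 4.23, forms $y$, $\dot{y}$ and $\ddot{y}$ through equation 4.24, and evaluates the left-hand side of the van der Pol form 4.27, which should vanish.

```python
import math

def van_der_pol_residual(x, x_dot, a, b, w0, y0):
    """Left-hand side of equation 4.27 evaluated on a Rayleigh-equation state."""
    # Rayleigh equation 4.22 solved for the acceleration.
    x_ddot = (a - b * x_dot ** 2) * x_dot - w0 ** 2 * x
    # Its time derivative, equation 4.23, solved for the third derivative.
    x_dddot = (a - 3.0 * b * x_dot ** 2) * x_ddot - w0 ** 2 * x_dot
    # Substitution 4.24: y is proportional to the velocity.
    scale = y0 * math.sqrt(3.0 * b / a)
    y, y_dot, y_ddot = scale * x_dot, scale * x_ddot, scale * x_dddot
    return y_ddot - (a / y0 ** 2) * (y0 ** 2 - y ** 2) * y_dot + w0 ** 2 * y

if __name__ == "__main__":
    for x, x_dot in [(0.3, 0.7), (-1.2, 0.4), (2.0, -1.5)]:
        print((x, x_dot), van_der_pol_residual(x, x_dot, a=1.3, b=0.8, w0=1.1, y0=2.0))
```

The residual is zero to rounding error for any positive $a$ and $b$, which confirms that the velocity in Rayleigh's equation obeys the van der Pol equation after rescaling.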


The rhythm of a heartbeat driven by a pacemaker is an important application where the self-stabilization of 
the attractor is a desirable characteristic to stabilize an irregular heartbeat; the medical term is arrhythmia. 
The mechanism that leads to synchronization of the many pacemaker cells in the heart and human body due 
to the influence of an implanted pacemaker is discussed in chapter 14.12. Another biological application of 
limit cycles is the time variation of animal populations. 

In summary the non-linear damping of the van der Pol oscillator leads to a self-stabilized, single limit- 
cycle attractor that is insensitive to the initial conditions. The van der Pol oscillator has many important 
applications such as bowed musical instruments, electrical circuits, and human anatomy as mentioned above. 
The van der Pol oscillator illustrates the complicated manifestations of the motion that can be exhibited by 
non-linear systems.


4.5 Harmonically-driven, linearly-damped, plane pendulum


The harmonically-driven, linearly-damped, plane pendulum illustrates many of the phenomena exhibited by 
non-linear systems as they evolve from ordered to chaotic motion. It illustrates the remarkable fact that 
determinism does not imply either regular behavior or predictability. The well-known, harmonically-driven 
linearly-damped pendulum provides an ideal basis for an introduction to non-linear dynamics. 

Consider a harmonically-driven linearly-damped plane pendulum of moment of inertia $I$ and mass $m$ in
a gravitational field that is driven by a torque due to a force $F(t) = F_D\cos\omega t$ acting at a moment arm $L$.
The damping term is $b$ and the angular displacement of the pendulum, relative to the vertical, is $\theta$. The
equation of motion of the harmonically-driven linearly-damped simple pendulum can be written as

$$I\ddot{\theta} + b\dot{\theta} + mgL\sin\theta = LF_D\cos\omega t \tag{4.28}$$

Note that the sinusoidal restoring force for the plane pendulum is non-linear for large angles $\theta$. The natural
angular frequency of the free pendulum is

$$\omega_0 = \sqrt{\frac{mgL}{I}} \tag{4.29}$$

A dimensionless parameter $\gamma$, which is called the drive strength, is defined by

$$\gamma \equiv \frac{F_D}{mg} \tag{4.30}$$

The equation of motion 4.28 can be generalized by introducing dimensionless units for both time $\tilde{t}$ and
relative drive frequency $\tilde{\omega}$ defined by

$$\tilde{t} \equiv \omega_0 t \qquad \tilde{\omega} \equiv \frac{\omega}{\omega_0} \tag{4.31}$$

In addition, define the inverse damping factor $Q$ as

$$Q \equiv \frac{\omega_0 I}{b} \tag{4.32}$$

These definitions allow equation 4.28 to be written in the dimensionless form

$$\frac{d^2\theta}{d\tilde{t}^2} + \frac{1}{Q}\frac{d\theta}{d\tilde{t}} + \sin\theta = \gamma\cos\tilde{\omega}\tilde{t} \tag{4.33}$$

The behavior of the angle $\theta$ for the driven damped plane pendulum depends on the drive strength $\gamma$
and the damping factor $Q$. Consider the case where equation 4.33 is evaluated assuming that the damping
coefficient $Q = 2$, and that the relative angular frequency $\tilde{\omega} = \frac{2}{3}$, which is close to resonance where chaotic
phenomena are manifest. The Runge-Kutta method is used to solve this non-linear equation of motion.
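
A fourth-order Runge-Kutta integration of the dimensionless equation 4.33 can be sketched as follows. This is an illustration under stated assumptions, not the code used to produce the figures: the relative drive frequency is taken to be 2/3 with $Q = 2$, the motion starts from rest at $\theta = 0$, and the function names, step size and number of drive periods are arbitrary choices.

```python
import math

def rk4_step(rhs, t, state, dt):
    """One fourth-order Runge-Kutta step for d(state)/dt = rhs(t, state)."""
    k1 = rhs(t, state)
    k2 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def steady_state_amplitude(gamma, q=2.0, drive=2.0 / 3.0,
                           periods=20, steps_per_period=400):
    """Half the peak-to-peak swing of theta over the last drive period."""
    def rhs(t, state):
        theta, omega = state
        # Equation 4.33 rewritten as two first-order equations.
        return [omega, -omega / q - math.sin(theta) + gamma * math.cos(drive * t)]
    dt = 2.0 * math.pi / drive / steps_per_period
    state = [0.0, 0.0]
    lowest, highest = float("inf"), float("-inf")
    total_steps = periods * steps_per_period
    for step in range(total_steps):
        state = rk4_step(rhs, step * dt, state, dt)
        if step >= total_steps - steps_per_period:
            lowest = min(lowest, state[0])
            highest = max(highest, state[0])
    return 0.5 * (highest - lowest)

if __name__ == "__main__":
    print("gamma = 0.2 amplitude:", steady_state_amplitude(0.2))
```

For $\gamma = 0.2$ the linear-response estimate $\gamma/\sqrt{(1-\tilde{\omega}^2)^2 + (\tilde{\omega}/Q)^2}$ gives about 0.31 radians, and the integration returns a value close to this.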


4.5.1 Close to linearity 


For drive strength $\gamma = 0.2$ the amplitude is sufficiently small that $\sin\theta \approx \theta$, superposition applies, and the
solution is identical to that for the driven linearly-damped linear oscillator. As shown in figure 4.5, once
the transient solution dies away, the steady-state solution asymptotically approaches one attractor that has
an amplitude of $\pm 0.3$ radians and a phase shift $\delta$ with respect to the driving force. The abscissa is given
in units of the dimensionless time $\tilde{t} = \omega_0 t$. The transient solution depends on the initial conditions and
dies away after about 5 periods, whereas the steady-state solution is independent of the initial conditions
and has a state-space diagram that has an elliptical shape, characteristic of the harmonic oscillator. For all
initial conditions, the time dependence and state space diagram for steady-state motion approaches a unique
solution, called an “attractor”, that is, the pendulum oscillates sinusoidally with a given amplitude at the
frequency of the driving force and with a constant phase shift $\delta$, i.e.

$$\theta(t) = A\cos(\omega t - \delta). \tag{4.34}$$

This solution is identical to that for the harmonically-driven, linearly-damped, linear oscillator discussed in
chapter 3.6.


1A similar approach is used by the book "Chaotic Dynamics" by Baker and Gollub[Bak96]. 

Figure 4.5: Motion of the driven damped pendulum for drive strengths of $\gamma = 0.2$, $\gamma = 0.9$, $\gamma = 1.05$, and
$\gamma = 1.078$. The left side shows the time dependence of the deflection angle $\theta$ with the time axis expressed
in dimensionless units $\tilde{t}$. The right side shows the corresponding state-space plots. These plots assume
$\tilde{\omega} = \frac{\omega}{\omega_0} = \frac{2}{3}$, $Q = 2$, and the motion starts with $\theta = \omega = 0$.

Figure 4.6: The driven damped pendulum assuming that $\tilde{\omega} = \frac{2}{3}$, $Q = 2$, with initial conditions $\theta(0) = -\frac{\pi}{2}$,
$\omega(0) = 0$. The system exhibits period-two motion for drive strengths of $\gamma = 1.078$ as shown by the state
space diagram for cycles 10 to 20. For $\gamma = 1.081$ the system exhibits period-four motion shown for cycles
10 to 30.


4.5.2 Weak nonlinearity 


Figure 4.5 shows that for drive strength $\gamma = 0.9$, after the transient solution dies away, the steady-state
solution settles down to one attractor that oscillates at the drive frequency with an amplitude of slightly 
more than 5 radians for which the small angle approximation fails. The distortion due to the non-linearity 
is exhibited by the non-elliptical shape of the state-space diagram. 

The observed behavior can be calculated using the successive approximation method discussed in chapter
4.2. That is, close to small angles the sine function can be approximated by replacing

$$\sin\theta \approx \theta - \frac{1}{6}\theta^3$$

in equation 4.33 to give

$$\frac{d^2\theta}{d\tilde{t}^2} + \frac{1}{Q}\frac{d\theta}{d\tilde{t}} + \left(\theta - \frac{1}{6}\theta^3\right) = \gamma\cos\tilde{\omega}\tilde{t} \tag{4.35}$$

As a first approximation assume that

$$\theta(\tilde{t}) \approx A\cos(\tilde{\omega}\tilde{t} - \delta)$$

then the small $\theta^3$ term in equation 4.35 contributes a term proportional to $\cos^3(\tilde{\omega}\tilde{t} - \delta)$. But

$$\cos^3(\tilde{\omega}\tilde{t} - \delta) = \frac{1}{4}\left(\cos 3(\tilde{\omega}\tilde{t} - \delta) + 3\cos(\tilde{\omega}\tilde{t} - \delta)\right)$$

That is, the nonlinearity introduces a small term proportional to $\cos 3(\tilde{\omega}\tilde{t} - \delta)$. Since the right-hand side of
equation 4.35 is a function of only $\cos\tilde{\omega}\tilde{t}$, then the terms in $\theta$, $\dot{\theta}$, and $\ddot{\theta}$ on the left hand side must contain
the third harmonic $\cos 3(\tilde{\omega}\tilde{t} - \delta)$ term. Thus a better approximation to the solution is of the form

$$\theta(\tilde{t}) = A\left[\cos(\tilde{\omega}\tilde{t} - \delta) + \varepsilon\cos 3(\tilde{\omega}\tilde{t} - \delta)\right] \tag{4.36}$$

where the admixture coefficient $\varepsilon < 1$. This successive approximation method can be repeated to add
additional terms proportional to $\cos n(\tilde{\omega}\tilde{t} - \delta)$ where $n$ is an integer with $n > 3$. Thus the nonlinearity
introduces progressively weaker $n$-fold harmonics to the solution. This successive approximation approach
is viable only when the admixture coefficient $\varepsilon < 1$. Note that these harmonics are integer multiples of $\tilde{\omega}$,
thus the steady-state response is identical for each full period even though the state space contours deviate
from an elliptical shape.
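
The harmonic content of the steady-state motion can be estimated from a numerical solution. The following is a rough sketch under stated assumptions, not the analysis used in the text: it integrates equation 4.33 with $\gamma = 0.9$, $Q = 2$ and a relative drive frequency of 2/3 from rest, discards 40 drive periods as transient, and projects one further period of $\theta$ onto the first three harmonics of the drive. All function names and numerical settings are illustrative.

```python
import math

def rk4_step(rhs, t, state, dt):
    """One fourth-order Runge-Kutta step for d(state)/dt = rhs(t, state)."""
    k1 = rhs(t, state)
    k2 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = rhs(t + 0.5 * dt, [s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = rhs(t + dt, [s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2 * b + 2 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def one_steady_period(gamma, q=2.0, drive=2.0 / 3.0,
                      settle_periods=40, steps_per_period=400):
    """Samples of theta spanning exactly one drive period after the transient."""
    def rhs(t, state):
        theta, omega = state
        return [omega, -omega / q - math.sin(theta) + gamma * math.cos(drive * t)]
    dt = 2.0 * math.pi / drive / steps_per_period
    state = [0.0, 0.0]
    settle_steps = settle_periods * steps_per_period
    for step in range(settle_steps):
        state = rk4_step(rhs, step * dt, state, dt)
    samples = [state[0]]
    for step in range(settle_steps, settle_steps + steps_per_period):
        state = rk4_step(rhs, step * dt, state, dt)
        samples.append(state[0])
    return samples

def harmonic_amplitudes(samples, n_max=3):
    """Amplitudes of harmonics 1..n_max from samples covering one full period."""
    count = len(samples) - 1  # the last sample repeats the first point of the cycle
    amplitudes = []
    for n in range(1, n_max + 1):
        a = sum(samples[k] * math.cos(2 * math.pi * n * k / count) for k in range(count))
        b = sum(samples[k] * math.sin(2 * math.pi * n * k / count) for k in range(count))
        amplitudes.append(2.0 * math.hypot(a, b) / count)
    return amplitudes

if __name__ == "__main__":
    amps = harmonic_amplitudes(one_steady_period(0.9))
    print("harmonic amplitudes:", amps)
    print("third-harmonic admixture:", amps[2] / amps[0])
```

The ratio of the third-harmonic amplitude to the fundamental gives an estimate of the magnitude of the admixture coefficient $\varepsilon$ in equation 4.36.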


4.5.3 Onset of complication 


Figure 4.5 shows that for $\gamma = 1.05$ the drive strength is sufficiently strong to cause the transient solution for
the pendulum to rotate through two complete cycles before settling down to a single steady-state attractor
solution at the drive frequency. However, this attractor solution is shifted two complete rotations relative
to the initial condition. The state space diagram clearly shows the rolling motion of the transient solution
for the first two periods prior to the system settling down to a single steady-state attractor. The successive
approximation approach completely fails at this coupling strength since $\theta$ oscillates through large values that
are multiples of $\pi$.

Figure 4.5 shows that for drive strength $\gamma = 1.078$ the motion evolves to a much more complicated
periodic motion with a period that is three times the period of the driving force. Moreover the amplitude
exceeds $2\pi$ corresponding to the pendulum oscillating over top dead center with the centroid of the motion
offset by $3\pi$ from the initial condition. Both the state-space diagram, and the time dependence of the motion,
illustrate the complexity of this motion which depends sensitively on the magnitude of the drive strength $\gamma$,
in addition to the initial conditions, $(\theta(0),\omega(0))$ and damping factor $Q$ as is shown in figure 4.6.


4.5.4 Period doubling and bifurcation 


For drive strength γ = 1.078, with the initial condition (θ(0), ω(0)) = (0, 0), the system exhibits a regular
motion with a period that is three times the drive period. In contrast, if the initial condition is
[θ(0) = −π/2, ω(0) = 0] then, as shown in figure 4.6, the steady-state solution has the drive frequency with no offset
in θ, that is, it exhibits period-one oscillation. This appearance of two separate and very different attractors
for γ = 1.078, using different initial conditions, is called bifurcation.

An additional feature of the system response for γ = 1.078 is that changing the initial conditions to
[θ(0) = −π/2, ω(0) = 0] shows that the even and odd periods of oscillation differ slightly
in shape and amplitude, that is, the system really has period-two oscillation. This period-two motion, i.e.
period doubling, is clearly illustrated by the state space diagram in that, although the motion still is
dominated by period-one oscillations, the even and odd cycles are slightly displaced. Thus, for different
initial conditions, the system for γ = 1.078 bifurcates into either of two attractors that have very different
waveforms, one of which exhibits period doubling.

The period doubling exhibited for γ = 1.078 is followed by a second period doubling when γ = 1.081 as
shown in figure 4.6. With increase in drive strength this period doubling keeps increasing in binary multiples
to period 8, 16, 32, 64 etc. Numerically it is found that the threshold for period doubling is γ_1 = 1.0663,
and that the transition from period two to period four occurs at γ_2 = 1.0793, etc. Feigenbaum showed that the
spacing of the thresholds in this cascade shrinks with increase in drive strength according to the relation

$$(\gamma_{n+1} - \gamma_n) \simeq \frac{1}{\delta}\,(\gamma_n - \gamma_{n-1}) \tag{4.37}$$

where δ = 4.6692016 is called a Feigenbaum number. As n → ∞ this cascading sequence goes to a limit
γ_c where

$$\gamma_c = 1.0829 \tag{4.38}$$
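
The quoted thresholds can be checked against the Feigenbaum relation (γ_{n+1} − γ_n) ≈ (γ_n − γ_{n−1})/δ with a
few lines of arithmetic. The sketch below assumes that the ratio δ already holds exactly from the first interval
onward, which is only asymptotically true, so the result is an estimate rather than a derivation of the limit.
The function name is illustrative.

```python
FEIGENBAUM_DELTA = 4.6692016  # value quoted in the text


def doubling_thresholds(gamma_1, gamma_2, n_terms=30, delta=FEIGENBAUM_DELTA):
    """Extrapolate period-doubling thresholds, assuming each interval is
    1/delta times the previous one (exact only in the limit of large n)."""
    thresholds = [gamma_1, gamma_2]
    for _ in range(n_terms - 2):
        step = (thresholds[-1] - thresholds[-2]) / delta
        thresholds.append(thresholds[-1] + step)
    return thresholds


thresholds = doubling_thresholds(1.0663, 1.0793)
gamma_c_estimate = thresholds[-1]

# Closed form of the same geometric series: gamma_2 + (gamma_2 - gamma_1)/(delta - 1)
gamma_c_closed_form = 1.0793 + (1.0793 - 1.0663) / (FEIGENBAUM_DELTA - 1.0)

print(f"estimated gamma_3 = {thresholds[2]:.4f}")
print(f"estimated gamma_c = {gamma_c_estimate:.4f}")
```

With these inputs the extrapolated accumulation point comes out near 1.0828, consistent with the quoted limit
of 1.0829 to within the rounding of the two input thresholds.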


4.5.5 Rolling motion 


It was shown that for γ > 1.05 the transient solution causes the pendulum to have angle excursions exceeding
2π, that is, the system rolls over top dead center. For drive strengths in the range 1.3 < γ < 1.4, the
steady-state solution for the system undergoes continuous rolling motion as illustrated in figure 4.7. The time
dependence for the angle exhibits a periodic oscillatory motion superimposed upon a monotonic rolling
motion, whereas the time dependence of the angular velocity ω = dθ/dt is periodic. The state space plots
for rolling motion correspond to a chain of loops with a spacing of 2π between each loop. The state space
diagram for rolling motion is more compactly presented if the origin is shifted by 2π per revolution to keep
the plot within bounds as illustrated in figure 4.7c.

Figure 4.7: Rolling motion for the driven damped plane pendulum for γ = 1.4. (a) The time dependence
of angle θ(t) increases by 2π per drive period whereas (b) the angular velocity ω(t) exhibits periodicity. (c)
The state space plot for rolling motion is shown with the origin shifted by 2π per revolution to keep the plot
within the bounds −π < θ < +π.
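
The origin shift used to keep a rolling trajectory within bounds amounts to reducing the angle modulo 2π. A
minimal sketch, assuming angles in radians and the half-open interval [−π, π) as the plotting range; the helper
name `wrap_angle` is hypothetical:

```python
import math


def wrap_angle(theta):
    """Map an angle in radians onto the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


# A rolling pendulum whose angle advances by 2*pi every drive period
# returns to the same wrapped angle once per period.
unwrapped = [0.3 + 2.0 * math.pi * n for n in range(4)]
wrapped = [wrap_angle(theta) for theta in unwrapped]
print(wrapped)
```

Only the angle is wrapped; the angular velocity is left unchanged, so each loop of the chain is drawn on top of
the previous one.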


4.5.6 Onset of chaos 


When the drive strength is increased to γ = 1.105, then the system does not approach a unique attractor
as illustrated by figure 4.8 left, which shows state space orbits for cycles 25 to 200. Note that these orbits do
not repeat, implying the onset of chaos. For drive strengths greater than γ_c = 1.0829 the driven damped
plane pendulum starts to exhibit chaotic behavior. The onset of chaotic motion is illustrated by making a
3-dimensional plot which combines the time coordinate with the state-space coordinates as illustrated in figure
4.8 right. This plot shows 16 trajectories starting at different initial values in the range −0.15 < θ < 0.15
for γ = 1.168. Some solutions are erratic in that, while trying to oscillate at the drive frequency, they never
settle down to a steady periodic motion, which is characteristic of chaotic motion. Figure 4.8 right illustrates
the considerable sensitivity of the motion to the initial conditions. That is, this deterministic system can
exhibit either order, or chaos, dependent on minuscule differences in initial conditions.

Figure 4.8: Left: State-space orbits for the driven damped pendulum with γ = 1.105. Note that the orbits
do not repeat for cycles 25 to 200. Right: Time-state-space diagram for γ = 1.168. The plot shows 16
trajectories starting with different initial values in the range −0.15 < θ < 0.15.

Figure 4.9: State-space plots for the harmonically-driven, linearly-damped, pendulum for driving amplitudes
of F_D = 0.5 and F_D = 1.2. These calculations were performed using the Runge-Kutta method by E. Shah
(private communication).


4.6 Differentiation between ordered and chaotic motion 


Chapter 4.5 showed that motion in non-linear systems can exhibit both order and chaos. The transition
between ordered motion and chaotic motion depends sensitively on both the initial conditions and the model
parameters. It is surprisingly difficult to unambiguously distinguish between complicated ordered motion
and chaotic motion. Moreover, the motion can fluctuate between order and chaos in an erratic manner
depending on the initial conditions. The extreme sensitivity to initial conditions of the motion for non-linear
systems makes it essential to have quantitative measures that can characterize the degree of order, and
interpret the complicated dynamical motion of systems. As an illustration, consider the harmonically-driven,
linearly-damped, pendulum with Q = 2, and driving force F(t) = F_D sin ω̃t where ω̃ = 2/3. Figure 4.9 shows
the state-space plots for two driving amplitudes, F_D = 0.5 which leads to ordered motion, and F_D = 1.2
which leads to possible chaotic motion. It can be seen that for F_D = 0.5 the state-space diagram converges
to a single attractor once the transient solution has died away. This is in contrast to the case for F_D = 1.2,
where the state-space diagram does not converge to a single attractor, but exhibits possible chaotic motion.
Three quantitative measures can be used to differentiate ordered motion from chaotic motion for this system;
namely, the Lyapunov exponent, the bifurcation diagram, and the Poincaré section, as illustrated below.


4.6.1 Lyapunov exponent 


The Lyapunov exponent provides a quantitative and useful measure of the instability of trajectories, and how
quickly nearby initial conditions diverge. It compares two identical systems that start with an infinitesimally
small difference in the initial conditions in order to ascertain whether they converge to the same attractor
at long times, corresponding to a stable system, or whether they diverge to very different attractors,
characteristic of chaotic motion. If the initial separation between the trajectories in phase space at t = 0 is |δZ_0|,
then to first order the time dependence of the difference can be assumed to depend exponentially on time.
That is,

$$|\delta Z(t)| \sim e^{\lambda t}\,|\delta Z_0| \tag{4.39}$$

where λ is the Lyapunov exponent. That is, the Lyapunov exponent is defined to be

$$\lambda = \lim_{t\to\infty}\,\lim_{\delta Z_0\to 0}\,\frac{1}{t}\ln\frac{|\delta Z(t)|}{|\delta Z_0|} \tag{4.40}$$


Systems for which the Lyapunov exponent λ < 0 (negative) converge exponentially to the same attractor
solution at long times since |δZ(t)| → 0 for t → ∞. By contrast, systems for which λ > 0 (positive) diverge
to completely different long-time solutions, that is, |δZ(t)| → ∞ for t → ∞. Even for infinitesimally
small differences in the initial conditions, systems having a positive Lyapunov exponent diverge to different
attractors, whereas when the Lyapunov exponent λ < 0 they correspond to stable solutions.

Figure 4.10: Lyapunov plots of Δθ versus time for two initial starting points differing by Δθ_0 = 0.001 rad.
The parameters are Q = 2, F(t) = F_D sin(2t/3), and Δt = 0.04 s. The Lyapunov exponent for F_D = 0.5,
which is drawn as a dashed line, is convergent with λ = −0.251. For F_D = 1.2 the exponent is divergent as
indicated by the dashed line which has a slope of λ = 0.1538. These calculations were performed using the
Runge-Kutta method by E. Shah (private communication).

Figure 4.10 illustrates Lyapunov plots for the harmonically-driven, linearly-damped, plane pendulum,
with the same conditions discussed in chapter 4.5. Note that for the small driving amplitude F_D = 0.5,
the Lyapunov plot converges to ordered motion with an exponent λ = −0.251, whereas for F_D = 1.2, the
plot diverges, characteristic of chaotic motion, with an exponent λ = 0.1538. The Lyapunov exponent usually
fluctuates widely at the local oscillator frequency, and thus the time average of the Lyapunov exponent must
be taken over many periods of the oscillation to identify the general trend with time. Some systems near an
order-to-chaos transition can exhibit positive Lyapunov exponents for short times, characteristic of chaos,
and then converge to negative λ at longer times, implying ordered motion. The Lyapunov exponents are
used extensively to monitor the stability of the solutions for non-linear systems. For example the Lyapunov
exponent is used to identify whether fluid flow is laminar or turbulent as discussed in chapter 16.8.
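
A minimal sketch of how a finite-time Lyapunov estimate might be computed for two nearby trajectories. It is
not the calculation behind figure 4.10. It assumes the dimensionless equation of motion
θ̈ + θ̇/Q + sin θ = F_D sin(ω̃t) with Q = 2 and ω̃ = 2/3, a hand-written fourth-order Runge-Kutta step, both
trajectories starting at rest, and the phase-space distance √(Δθ² + Δω²) as the separation. All function names
are illustrative.

```python
import math


def deriv(t, state, q, f_drive, w_drive):
    """Assumed dimensionless pendulum: theta'' + theta'/Q + sin(theta) = F_D sin(w t)."""
    theta, omega = state
    return (omega, -omega / q - math.sin(theta) + f_drive * math.sin(w_drive * t))


def rk4_step(t, state, dt, *params):
    """Advance (theta, omega) by one fourth-order Runge-Kutta step."""
    k1 = deriv(t, state, *params)
    k2 = deriv(t + dt / 2, (state[0] + dt / 2 * k1[0], state[1] + dt / 2 * k1[1]), *params)
    k3 = deriv(t + dt / 2, (state[0] + dt / 2 * k2[0], state[1] + dt / 2 * k2[1]), *params)
    k4 = deriv(t + dt, (state[0] + dt * k3[0], state[1] + dt * k3[1]), *params)
    return (state[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            state[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))


def finite_time_lyapunov(f_drive, q=2.0, w_drive=2.0 / 3.0,
                         d_theta0=1e-3, dt=0.04, n_steps=1000):
    """Estimate lambda = ln(|dZ(t)| / |dZ_0|) / t from two nearby trajectories."""
    a = (0.0, 0.0)        # reference trajectory, starting at rest
    b = (d_theta0, 0.0)   # neighbour displaced by d_theta0 in angle
    for step in range(n_steps):
        t = step * dt
        a = rk4_step(t, a, dt, q, f_drive, w_drive)
        b = rk4_step(t, b, dt, q, f_drive, w_drive)
    separation = math.hypot(b[0] - a[0], b[1] - a[1])
    return math.log(separation / d_theta0) / (n_steps * dt)


for amplitude in (0.5, 1.2):
    print(f"F_D = {amplitude}: finite-time estimate = {finite_time_lyapunov(amplitude):.3f}")
```

For F_D = 0.5 the estimate should come out negative, of the order of the damping rate −1/(2Q). For F_D = 1.2 a
single short run gives only a noisy estimate, which is why the average must be taken over many periods.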

A dynamical system in n-dimensional phase space will have a set of n Lyapunov exponents {λ_1, λ_2, ..., λ_n}
associated with a set of attractors, the importance of which depend on the initial conditions. Typically one
Lyapunov exponent dominates at one specific location in phase space, and thus it is usual to use the maximal
Lyapunov exponent to identify chaos. The Lyapunov exponent is a very sensitive measure of the onset of chaos
and provides an important test of the chaotic nature for the complicated motion exhibited by non-linear
systems.


4.6.2 Bifurcation diagram 


The bifurcation diagram simplifies the presentation of the dynamical motion by sampling the status of
the system once per period, synchronized to the driving frequency, for many sets of initial conditions. The
results are presented graphically as a function of one parameter of the system in the bifurcation diagram. For
example, the wildly different behavior in the driven damped plane pendulum is represented on a bifurcation
diagram in figure 4.11, which shows the observed angular velocity ω of the pendulum sampled once per drive
cycle plotted versus drive strength. The bifurcation diagram is obtained by sampling either the angle θ,
or angular velocity ω, once per drive cycle, that is, it represents the observables of the pendulum using a
stroboscopic technique that samples the motion synchronous with the drive frequency. Bifurcation plots also
can be created as a function of either the time t, the damping factor Q, the normalized frequency ω̃ = ω/ω_0,
or the driving amplitude γ.

In the domain with drive strength γ < 1.0663 there is one unique angle each drive cycle as illustrated by the
bifurcation diagram. For slightly higher drive strength period-two bifurcation behavior results in two different
angles per drive cycle. The Lyapunov exponent is negative for this region, corresponding to ordered motion. The
cascade of period doubling with increase in drive strength is readily apparent until chaos sets in at the critical
drive strength γ_c when there is a random distribution of sampled angular velocities and the Lyapunov exponent
becomes positive. Note that at γ = 1.0845 there is a brief interval of period-6 motion followed by another
region of chaos. Around γ = 1.1 there is a region that is primarily chaotic, which is reflected by chaotic values of
the angular velocity on the bifurcation plot and large positive values of the Lyapunov exponent. The region
around γ = 1.12 exhibits period-three motion and negative Lyapunov exponent corresponding to ordered motion.
The 1.15 < γ < 1.25 region is mainly chaotic and has a large positive Lyapunov exponent. The region with
1.3 < γ < 1.4 is striking in that this corresponds to rolling motion with reemergence of period one and negative
Lyapunov exponent. This period-1 motion is due to a continuous rolling motion of the plane pendulum as shown
in figure 4.7 where it is seen that the average θ increases 2π per cycle, whereas the angular velocity ω exhibits
a periodic motion. That is, on average the pendulum is rotating 2π per cycle. Above γ = 1.4 the system starts
to exhibit period doubling followed by chaos reminiscent of the behavior seen at lower γ values.

These results show that the bifurcation diagram nicely illustrates the order to chaos transitions for the 
harmonically-driven, linearly-damped, pendulum. Several transitions between order and chaos are seen to 
occur. The apparent ordered and chaotic regimes are confirmed by the corresponding Lyapunov exponents 
which alternate between negative and positive values for the ordered and chaotic regions respectively. 
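
A minimal sketch of the stroboscopic sampling that underlies a bifurcation diagram. Because the parameters of
the γ-form of the equation of motion are not restated here, the sketch instead assumes the dimensionless form
θ̈ + θ̇/Q + sin θ = F_D sin(ω̃t) with Q = 2 and ω̃ = 2/3, and scans the driving amplitude F_D. The integrator, the
number of discarded transient cycles, and the function names are illustrative choices.

```python
import math


def rk4_step(t, theta, omega, dt, q, f_drive, w_drive):
    """One fourth-order Runge-Kutta step for the assumed pendulum equation."""
    def accel(time, th, om):
        return -om / q - math.sin(th) + f_drive * math.sin(w_drive * time)

    k1t, k1w = omega, accel(t, theta, omega)
    k2t, k2w = omega + dt / 2 * k1w, accel(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w)
    k3t, k3w = omega + dt / 2 * k2w, accel(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w)
    k4t, k4w = omega + dt * k3w, accel(t + dt, theta + dt * k3t, omega + dt * k3w)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega


def strobe_omega(f_drive, q=2.0, w_drive=2.0 / 3.0, n_skip=30, n_keep=20, steps=200):
    """Sample the angular velocity once per drive period after discarding transients."""
    dt = 2.0 * math.pi / w_drive / steps   # an integer number of steps per drive period
    theta, omega = 0.0, 0.0
    samples = []
    for cycle in range(n_skip + n_keep):
        for i in range(steps):
            theta, omega = rk4_step((cycle * steps + i) * dt, theta, omega, dt,
                                    q, f_drive, w_drive)
        if cycle >= n_skip:
            samples.append(omega)
    return samples


def bifurcation_data(drive_strengths):
    """Return (drive strength, sampled omega) pairs ready for a scatter plot."""
    return [(f, w) for f in drive_strengths for w in strobe_omega(f)]


data = bifurcation_data([0.5, 0.9, 1.2])
print(len(data), "points")
```

Period-one motion shows up as a single repeated value of ω for a given drive strength, period-two motion as two
alternating values, and chaotic motion as a scatter of values.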


Figure 4.11: Bifurcation diagram samples the angular velocity ω once per period for the driven,
linearly-damped, plane pendulum plotted as a function of the drive strength γ. Regions of period doubling,
and chaos, as well as islands of stability all are manifest as the drive strength γ is changed. Note that
the limited number of samples causes broadening of the lines adjacent to bifurcations.


4.6.3 Poincaré Section 


State-space plots are very useful for characterizing periodic motion, but they become too dense for useful
interpretation when the system approaches chaos as illustrated in figure 4.11. Poincaré sections solve this
difficulty by taking a stroboscopic sample once per cycle of the state-space diagram. That is, the point on
the state space orbit is sampled once per drive period. For period-1 motion this corresponds to a single
point (θ, ω). For period-2 motion this corresponds to two points, etc. For chaotic systems the sequence of
state-space sample points follows complicated trajectories. Figure 4.12 shows the Poincaré sections for the
corresponding state space diagram shown in figure 4.9 for cycles 10 to 6000. Note the complicated curves do
not cross or repeat. Enlargements of any part of this plot will show increasingly dense parallel trajectories,
called fractals, that indicate the complexity of the chaotic cyclic motion. That is, zooming in on a small
section of this Poincaré plot shows many closely parallel trajectories. The fractal attractors are surprisingly
robust to large differences in initial conditions. Poincaré sections are a sensitive probe of periodic motion
for systems where periodic motion is not readily apparent.
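
A minimal sketch of how the points of a Poincaré section might be collected, again assuming the dimensionless
equation θ̈ + θ̇/Q + sin θ = F_D sin(ω̃t) with Q = 2 and ω̃ = 2/3, a hand-written Runge-Kutta step, and the angle
wrapped into [−π, π). It uses far fewer cycles than the 6000 quoted for figure 4.12 and is not the calculation
used there; the function names are illustrative.

```python
import math


def step_rk4(t, state, dt, q, f_drive, w_drive):
    """One fourth-order Runge-Kutta step for the state (theta, omega)."""
    def rhs(time, s):
        return (s[1], -s[1] / q - math.sin(s[0]) + f_drive * math.sin(w_drive * time))

    def shifted(s, k, h):
        return (s[0] + h * k[0], s[1] + h * k[1])

    k1 = rhs(t, state)
    k2 = rhs(t + dt / 2, shifted(state, k1, dt / 2))
    k3 = rhs(t + dt / 2, shifted(state, k2, dt / 2))
    k4 = rhs(t + dt, shifted(state, k3, dt))
    return tuple(state[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2))


def poincare_section(f_drive, q=2.0, w_drive=2.0 / 3.0, n_skip=10, n_keep=50, steps=200):
    """Return one (theta, omega) point per drive period, with theta wrapped to [-pi, pi)."""
    dt = 2.0 * math.pi / w_drive / steps
    state = (0.0, 0.0)
    points = []
    for cycle in range(n_skip + n_keep):
        for i in range(steps):
            state = step_rk4((cycle * steps + i) * dt, state, dt, q, f_drive, w_drive)
        if cycle >= n_skip:
            wrapped = (state[0] + math.pi) % (2.0 * math.pi) - math.pi
            points.append((wrapped, state[1]))
    return points


points = poincare_section(1.2)
print(len(points), "section points, first:", points[0])
```

Plotting ω against the wrapped θ for many thousands of cycles is what reveals the layered structure described
above; the short run here only demonstrates the sampling.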

In summary, the behavior of the well-known, harmonically-driven, linearly-damped, plane pendulum
becomes remarkably complicated at large driving amplitudes where non-linear effects dominate, that is,
when the restoring force is non-linear. The system exhibits bifurcation where it can evolve to multiple
attractors that depend sensitively on the initial conditions. The system exhibits both oscillatory, and rolling,
solutions depending on the amplitude of the motion. The system exhibits domains of simple ordered motion
separated by domains of very complicated ordered motion as well as chaotic regions. The transitions between
these dramatically different modes of motion are extremely sensitive to the amplitude and phase of the
driver. Eventually the motion becomes completely chaotic. The Lyapunov exponent, bifurcation diagram,
and Poincaré section plots are sensitive measures of the order of the motion. These three sensitive measures
of order and chaos are used extensively in many fields in classical mechanics. Considerable computing
capabilities are required to elucidate the complicated motion involved in non-linear systems. Examples
include laminar and turbulent flow in fluid dynamics and weather forecasting of hurricanes, where the
motion can span a wide dynamic range in dimensions from 10⁻⁵ to 10⁴ m.

Figure 4.12: Three Poincaré section plots for the harmonically-driven, linearly-damped, pendulum for various
initial conditions with F_D = 1.2, 0 = 2, and Δt = zp-. These calculations used the Runge-Kutta method
and were performed for 6000 cycles by E. Shah (private communication).


4.7 Wave propagation for non-linear systems 


4.7.1 Phase, group, and signal velocities 


Chapter 3 discussed the wave equation and solutions for linear systems. It was shown that, for linear systems,
the wave motion obeys superposition and exhibits dispersion, that is, a frequency-dependent phase velocity,
and, in some cases, attenuation. Nonlinear systems introduce intriguing new wave phenomena. For example
for nonlinear systems, second, and higher terms must be included in the Taylor expansion given in equation
4.2. These second and higher order terms result in the group velocity being a function of ω, that is, group
velocity dispersion occurs which leads to the shape of the envelope of the wave packet being time dependent.
As a consequence the group velocity in the wave packet is not well defined, and does not equal the signal
velocity of the wave packet or the phase velocity of the wavelets. Nonlinear optical systems have been studied
experimentally where v_group << c, which is called slow light, while other systems have v_group > c which is
called superluminal light. The ability to control the velocity of light in such optical systems is of considerable
current interest since it has signal transmission applications.
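The dispersion relation for a nonlinear system can be expressed as a Taylor expansion of the form

$$k = k_0 + \left(\frac{\partial k}{\partial\omega}\right)_{\omega=\omega_0}(\omega - \omega_0) + \frac{1}{2}\left(\frac{\partial^2 k}{\partial\omega^2}\right)_{\omega=\omega_0}(\omega - \omega_0)^2 + \dots \tag{4.41}$$

where ω is used as the independent variable since it is invariant to phase transitions of the system. Note
that the factor for the first derivative term is the reciprocal of the group velocity

$$\left(\frac{\partial k}{\partial\omega}\right)_{\omega=\omega_0} = \frac{1}{v_{\mathrm{group}}} \tag{4.42}$$

while the factor for the second derivative term is

$$\left(\frac{\partial^2 k}{\partial\omega^2}\right)_{\omega=\omega_0} = \left[\frac{\partial}{\partial\omega}\frac{1}{v_{\mathrm{group}}(\omega)}\right]_{\omega=\omega_0} = \left[-\frac{1}{v_{\mathrm{group}}^2}\frac{\partial v_{\mathrm{group}}}{\partial\omega}\right]_{\omega=\omega_0} \tag{4.43}$$

which gives the velocity dispersion for the system. Since

$$k = \frac{\omega}{v_{\mathrm{phase}}} \tag{4.44}$$

then

$$\frac{\partial k}{\partial\omega} = \frac{1}{v_{\mathrm{group}}} = \frac{1}{v_{\mathrm{phase}}} + \omega\frac{\partial}{\partial\omega}\left(\frac{1}{v_{\mathrm{phase}}}\right) \tag{4.45}$$

The inverse velocities for electromagnetic waves are best represented in terms of the corresponding refractive
indices n, where

$$n = \frac{c}{v_{\mathrm{phase}}} \tag{4.46}$$

and the group refractive index

$$n_{\mathrm{group}} = \frac{c}{v_{\mathrm{group}}} \tag{4.47}$$

Then equation 4.45 can be written in the more convenient form

$$n_{\mathrm{group}} = n + \omega\frac{\partial n}{\partial\omega} \tag{4.48}$$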
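
As a rough numerical illustration of the relation n_group = n + ω ∂n/∂ω, the sketch below evaluates the group
index for a single Lorentzian resonance. The model for the real part of the phase index, the parameter values
(resonance at ω_0 = 1, width 0.1, strength 10⁻³), and the function names are all assumptions made for
illustration; they are not the model used for figure 4.13.

```python
def phase_index(w, w0=1.0, width=0.1, strength=1e-3):
    """Real part of the phase refractive index for an assumed Lorentzian resonance."""
    detuning = w0 ** 2 - w ** 2
    return 1.0 + strength * detuning / (detuning ** 2 + (width * w) ** 2)


def group_index(w, h=1e-6):
    """n_group = n + w * dn/dw, with dn/dw from a central finite difference."""
    dn_dw = (phase_index(w + h) - phase_index(w - h)) / (2.0 * h)
    return phase_index(w) + w * dn_dw


for w in (0.8, 1.0, 1.2):
    print(f"w = {w}: n = {phase_index(w):.4f}, n_group = {group_index(w):.4f}")
```

With these illustrative numbers the group index at resonance is about 0.8, below the off-resonance value
(fast light), while in the wing at ω = 0.8 it exceeds the phase index (slow light).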
Wave propagation for an optical system that is subject to a single resonance gives one example of nonlinear
frequency response that has applications to optics.

Figure 4.13 shows that the real n_R and imaginary n_I parts of the phase refractive index exhibit the
characteristic resonance frequency dependence of the sinusoidally-driven, linear oscillator that was discussed
in chapter 3.6 and as illustrated in figure 3.10. Figure 4.13 also shows the group refractive index n_group
computed using equation 4.48.

Note that at resonance, n_group is reduced below the non-resonant value, which corresponds to superluminal
(fast) light, whereas in the wings of the resonance n_group is larger than the non-resonant value, corresponding
to slow light. Thus the nonlinear dependence of the refractive index n on angular frequency ω leads to fast
or slow group velocities for isolated wave packets. Velocities of light as slow as 17 m/s have been observed.
Experimentally the energy absorption that occurs on resonance makes it difficult to observe the superluminal
electromagnetic wave at resonance.

Note that Sommerfeld and Brillouin showed that even though the group velocity may exceed c, the signal
velocity, which marks the arrival of the leading edge of the optical pulse, does not exceed c, the velocity of
light in vacuum, as was postulated by Einstein. [Bri14]

Figure 4.13: The real and imaginary parts of the phase refractive index n plus the real part of the group
refractive index associated with an isolated atomic resonance.


4.7.2 Soliton wave propagation


The soliton is a fascinating and very special wave propagation phenomenon that occurs for certain non-linear
systems. The soliton is a self-reinforcing solitary localized wave packet that maintains its shape while
travelling long distances at a constant speed. Solitons are caused by a cancellation of phase modulation
resulting from non-linear velocity dependence, and the group velocity dispersive effects in a medium. Solitons
arise as solutions of a widespread class of weakly-nonlinear dispersive partial differential equations
describing many physical systems. Figure 4.14 shows a soliton comprising a solitary water wave approaching the
coast of Hawaii. While the soliton in Fig. 4.14 may appear like a normal wave, it is unique in that there are
no other waves accompanying it. This wave was probably created far away from the shore when a normal wave was
modulated by a geometrical change in the ocean depth, such as the rising sea floor, which forced it into the
appropriate shape for a soliton. The wave then was able to travel to the coast intact,
despite the apparently placid nature of the ocean near the beach. Solitons are notable in that they interact
with each other in ways very different from normal waves. Normal waves are known for their complicated
interference patterns that depend on the frequency and wavelength of the waves. Solitons can pass right
through each other without being affected at all. This makes solitons very appealing to scientists because
soliton waves are more sturdy than normal waves, and can therefore be used to transmit information in ways
that are distinctly different from normal wave motion. For example, optical solitons are used in optical
fibers made of a dispersive, nonlinear optical medium, to transmit optical pulses with an invariant shape.

Figure 4.14: A solitary wave approaches the coast of Hawaii. (Image: Robert Odom/University of Washington)

Solitons were first observed in 1834 by John Scott Russell (1808-1882). Russell was an engineer
conducting experiments to increase the efficiency of canal boats. His experimental and theoretical investigations
allowed him to recreate the phenomenon in wave tanks. Through his extensive studies, Scott Russell noticed
that soliton propagation exhibited the following properties:

• The waves are stable and hold their shape for long periods of time.

• The waves can travel over long distances at uniform speed.

• The speed of propagation of the wave depends on the size of the wave, with larger waves traveling
faster than smaller waves.

• The waves maintained their shape when they collided, seemingly passing right through each other.

Scott Russell’s work was met with scepticism by the scientific community. The problem with the Wave 
of Translation was that it was an effect that depended on nonlinear effects, whereas previously existing 
theories of hydrodynamics (such as those of Newton and Bernoulli) only dealt with linear systems. George 
Biddell Airy, and George Gabriel Stokes, published papers attacking Scott Russell’s observations because 
the observations could not be explained by their theories of wave propagation in water. Regardless, Scott 
Russell was convinced of the prime importance of the Wave of Translation, and history proved that he was 
correct. Scott Russell went on to develop the “wave line” system of hull construction that revolutionized 
nineteenth century naval architecture, along with a number of other great accomplishments leading him to 
fame and prominence. Despite all of the success in his career, he continued throughout his life to pursue his 
studies of the Wave of Translation. 

In 1895 Korteweg and de Vries developed a wave equation for surface waves for shallow water.

$$\frac{\partial\phi}{\partial t} + \frac{\partial^3\phi}{\partial x^3} + 6\phi\frac{\partial\phi}{\partial x} = 0 \tag{4.49}$$

A solution of this equation has the characteristics of a solitary wave with fixed shape. It is given by
substituting the form φ(x,t) = f(x − vt) into the Korteweg-de Vries equation which gives

$$-v\frac{\partial f}{\partial x} + \frac{\partial^3 f}{\partial x^3} + 6f\frac{\partial f}{\partial x} = 0 \tag{4.50}$$

Integrating with respect to x gives

$$\frac{d^2 f}{dx^2} + 3f^2 - vf = C \tag{4.51}$$

where C is a constant of integration. This non-linear equation has a solution

$$\phi(x,t) = \frac{1}{2}v\,\mathrm{sech}^2\left[\frac{\sqrt{v}}{2}(x - vt - a)\right] \tag{4.52}$$

where a is a constant. Equation 4.52 is the equation of a solitary wave moving in the +x direction at a
velocity v.
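
The solitary-wave profile can be checked numerically. The sketch below assumes the standard form
φ(x,t) = ½ v sech²[(√v/2)(x − vt − a)] and tests it against the once-integrated Korteweg-de Vries equation
f″ + 3f² − vf = C with C = 0, the value appropriate for a localized wave that vanishes far from its peak. The
second derivative is approximated by a central difference; the function names and the choice v = 2 are
illustrative.

```python
import math


def soliton(x, t, v, a=0.0):
    """phi(x, t) = (v/2) sech^2[(sqrt(v)/2)(x - v t - a)]."""
    arg = 0.5 * math.sqrt(v) * (x - v * t - a)
    return 0.5 * v / math.cosh(arg) ** 2


def integrated_kdv_residual(xi, v, h=1e-3):
    """Evaluate f'' + 3 f^2 - v f at xi = x - v t, using a central difference for f''."""
    def f(s):
        return soliton(s, 0.0, v)

    second_derivative = (f(xi + h) - 2.0 * f(xi) + f(xi - h)) / h ** 2
    return second_derivative + 3.0 * f(xi) ** 2 - v * f(xi)


max_residual = max(abs(integrated_kdv_residual(0.1 * i, v=2.0)) for i in range(-50, 51))
print(f"largest residual on the grid: {max_residual:.2e}")
print(f"peak height for v = 2: {soliton(3.0, 1.5, 2.0):.3f}")
```

The residual is zero to within the finite-difference error. The peak height v/2 and the width, which scales as
1/√v, show that taller solitons are narrower and travel faster, in line with Scott Russell's observation that
larger waves travel faster than smaller ones.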

Soliton behavior is observed in phenomena such as tsunamis, tidal bores that occur for some rivers,
signals in optical fibres, plasmas, atmospheric waves, vortex filaments, superconductivity, and gravitational
fields having cylindrical symmetry. Much work has been done on solitons for fibre optics applications. The
soliton’s inherent stability makes long-distance transmission possible without the use of repeaters, and could
potentially double the transmission capacity.

Before the discovery of solitons, mathematicians were under the impression that nonlinear partial differ- 
ential equations could not be solved exactly. However, solitons led to the recognition that there are non-linear 
systems that can be solved analytically. This discovery has prompted much investigation into these so-called 
“integrable systems.” Such systems are rare, as most non-linear differential equations admit chaotic behavior 
with no explicit solutions. Integrable systems nevertheless lead to very interesting mathematics ranging from 
differential geometry and complex analysis to quantum field theory and fluid dynamics. 

Many of the fundamental equations in physics (Maxwell’s, Schrödinger’s) are linear equations. However,
physicists have begun to recognize many areas of physics in which nonlinearity can result in qualitatively
new phenomena that cannot be constructed via perturbation theory starting from linearized equations.
These include phenomena in magnetohydrodynamics, meteorology, oceanography, condensed matter physics,
nonlinear optics, and elementary particle physics. For example, the European space mission Cluster detected
a soliton-like electrical disturbance that travelled through the ionized gas surrounding the Earth, starting
about 50,000 kilometers from Earth and travelling towards the planet at about 8 km/s. It is thought that
this soliton was generated by turbulence in the magnetosphere.

Efforts to understand the nonlinearity of solitons have led to much research in many areas of physics. In
the context of solitons, their particle-like behavior (in that they are localized and preserved under collisions) 
leads to a number of experimental and theoretical applications. The technique known as bosonization allows 
viewing particles, such as electrons and positrons, as solitons in appropriate field equations. There are 
numerous macroscopic phenomena, such as internal waves on the ocean, spontaneous transparency, and the 
behavior of light in fiber optic cable, that are now understood in terms of solitons. These phenomena are 
being applied to modern technology. 


4.8 Summary 


The study of the dynamics of non-linear systems remains a vibrant and rapidly evolving field in classical 
mechanics as well as many other branches of science. This chapter has discussed examples of non-linear 
systems in classical mechanics. It was shown that the superposition principle is broken even for weak 
nonlinearity. It was shown that increased nonlinearity leads to bifurcation, point attractors, limit-cycle 
attractors, and sensitivity to initial conditions. 

Limit-cycle attractors: The Poincaré-Bendixson theorem for limit cycle attractors states that the
paths, both in state-space and phase-space, can take one of three possible forms:

(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,

(2) paths that terminate at an equilibrium point as t → ∞, like the point attractor for a damped harmonic oscillator,

(3) paths that tend to a limit cycle as t → ∞.

The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.

The van der Pol oscillator is a common example of a limit-cycle system that has an equation of motion 
of the form 

dx dx 
dt? dt 

The van der Pol oscillator has a limit-cycle attractor that includes non-linear damping and exhibits 
periodic solutions that asymptotically approach one attractor solution independent of the initial conditions. 
There are many examples in nature that exhibit similar behavior. 

Harmonically-driven, linearly-damped, plane pendulum: The non-linearity of the well-known 
driven linearly-damped plane pendulum was used as an example of the behavior of non-linear systems in 
nature. It was shown that non-linearity leads to discontinuous period bifurcation, extreme sensitivity to 
initial conditions, rolling motion and chaos. 

Differentiation between ordered and chaotic motion: Lyapunov exponents, bifurcation diagrams, 
and Poincaré sections were used to identify the transition from order to chaos. Chapter 16.8 discusses 
the non-linear Navier-Stokes equations of viscous-fluid flow which leads to complicated transitions between 
laminar and turbulent flow. Fluid flow exhibits remarkable complexity that nicely illustrates the dominant 
role that non-linearity can have on the solutions of practical non-linear systems in classical mechanics. 

Wave propagation for non-linear systems: Non-linear equations can lead to unexpected behavior 
for wave packet propagation such as fast or slow light as well as soliton solutions. Moreover, it is notable 
that some non-linear systems can lead to analytic solutions. 

The complicated phenomena exhibited by the above non-linear systems is not restricted to classical 
mechanics, rather it is a manifestation of the mathematical behavior of the solutions of the differential 
equations involved. That is, this behavior is a general manifestation of the behavior of solutions for second- 
order differential equations. Exploration of this complex motion has only become feasible with the advent 
of powerful computer facilities during the past three decades. The breadth of phenomena exhibited by 
these examples is manifest in myriads of other nonlinear systems, ranging from many-body motion, weather 
patterns, growth of biological species, epidemics, motion of electrons in atoms, etc. Other examples of non- 
linear equations of motion not discussed here, are the three-body problem, which is mentioned in chapter 
11, and turbulence in fluid flow which is discussed in chapter 16. 

It is stressed that the behavior discussed in this chapter is very different from the random walk problem 
which is a stochastic process where each step is purely random and not deterministic. This chapter has 
assumed that the motion is fully deterministic and rigorously follows the laws of classical mechanics. Even 
though the motion is fully deterministic, and follows the laws of classical mechanics, the motion is extremely 
sensitive to the initial conditions and the non-linearities can lead to chaos. Computer modelling is the only 
viable approach for predicting the behavior of such non-linear systems. The complexity of solving non-linear 
equations is the reason that this book will continue to consider only linear systems. Fortunately, in nature, 
non-linear systems can be approximately linear when the small-amplitude assumption is applicable. 


$$\frac{d^2x}{dt^2} + \mu(x^2 - 1)\frac{dx}{dt} + \omega_0^2 x = 0 \qquad (4.11)$$


Workshop exercises 
1. Consider the chaotic motion of the driven damped pendulum whose equation of motion is given by
$$\ddot{\phi} + \Gamma\dot{\phi} + \omega_0^2\sin\phi = \gamma\omega_0^2\cos\omega t$$
for which the Lyapunov exponent is λ = 1 with time measured in units of the drive period.


(a) Assume that you need to predict φ(t) with an accuracy of $10^{-2}$ radians, and that the initial value φ(0) is
known to within $10^{-6}$ radians. What is the maximum time horizon $t_{max}$ for which you can predict φ(t)
to within the required accuracy?


(b) Suppose that you manage to improve the accuracy of the initial value to $10^{-9}$ radians (that is, a thousand-
fold improvement). What is the time horizon now for achieving the accuracy of $10^{-2}$ radians?


(c) By what factor has $t_{max}$ improved with the 1000-fold improvement in the initial measurement?


(d) What does this imply regarding long-term predictions of chaotic motion? 


2. A non-linear oscillator satisfies the equation $\ddot{x} + \dot{x}^3 + x = 0$. Find the polar equations for the motion in the
state-space diagram. Show that any trajectory that starts within the circle r < 1 encircles the origin infinitely
many times in the clockwise direction. Show further that these trajectories in state space terminate at the
origin.


3. Consider the system of a mass suspended between two identical springs as shown.


[Figure: a mass m suspended between two identical springs, each of stretched length l at equilibrium.]


If each spring is stretched a distance d to attach the mass at the equilibrium position, the mass is subject to
two equal and oppositely directed forces of magnitude kd. Ignore gravity. Show that the potential in which
the mass moves is approximately


$$U(x) = \frac{kd}{l}x^2 + \frac{k(l-d)}{4l^3}x^4$$


Construct a state-space diagram for this potential.


Problems 
1. A non-linear oscillator satisfies the equation 
$$\ddot{x} + (x^2 + \dot{x}^2 - 1)\dot{x} + x = 0$$


Find the polar equations for the motion in the state-space diagram. Show that any trajectory that starts in
the domain 1 < r < √3 spirals clockwise and tends to the limit cycle r = 1. [The same is true of trajectories
that start in the domain 0 < r < 1.] What is the period of the limit cycle?


2. A mass m moves in one dimension and is subject to a constant force $+F_0$ when x < 0 and to a constant force
$-F_0$ when x > 0. Describe the motion by constructing a state-space diagram. Calculate the period of the
motion in terms of m, $F_0$ and the amplitude A. Disregard damping.


3. Investigate the motion of an undamped mass subject to a force of the form


$$F(x) = \begin{cases} -kx & |x| < a \\ -(k+\delta)x + \delta a & |x| > a \end{cases}$$


Chapter 5 


Calculus of variations 


5.1 Introduction 


The prior chapters have focussed on the intuitive Newtonian approach to classical mechanics, which is based 
on vector quantities like force, momentum, and acceleration. Newtonian mechanics leads to second-order 
differential equations of motion. The calculus of variations underlies a powerful alternative approach to 
classical mechanics that is based on identifying the path that minimizes an integral quantity. This integral 
variational approach was first championed by Gottfried Wilhelm Leibniz, contemporaneously with Newton's 
development of the differential approach to classical mechanics. 

During the 18th century, Bernoulli, who was a student of Leibniz, developed the field of variational
calculus which underlies the integral variational approach to mechanics. He solved the brachistochrone
problem, which involves finding the path for which the transit time between two points is the shortest. The
integral variational approach also underlies Fermat's principle in optics, which can be used to derive that
the angle of reflection equals the angle of incidence, as well as to derive Snell's law. Other applications of the
calculus of variations include solving the catenary problem, finding the maximum and minimum distances
between two points on a surface, finding polygon shapes having the maximum ratio of enclosed area to
perimeter, and maximizing profit in economics. Bernoulli developed the principle of virtual work used to
describe equilibrium in static systems, and d'Alembert extended the principle of virtual work to dynamical
systems. Euler, the preeminent Swiss mathematician of the 18th century and a student of Bernoulli, developed
the calculus of variations with full mathematical rigor. The development of the Lagrangian variational
approach to classical mechanics culminated in the work of Lagrange (1736-1813), who was a student of Euler.

The Euler-Lagrangian approach to classical mechanics stems from a deep philosophical belief that the 
laws of nature are based on the principle of economy. That is, the physical universe follows paths through
space and time that are based on extrema principles. The standard Lagrangian L is defined as the difference 
between the kinetic and potential energy, that is 


L=T-U (5.1) 


Chapters 6 through 9 will show that the laws of classical mechanics can be expressed in terms of Hamilton's
variational principle, which states that the motion of the system between the initial time t₁ and final time
t₂ follows a path that minimizes the scalar action integral S, defined as the time integral of the Lagrangian.


$$S = \int_{t_1}^{t_2} L\,dt \qquad (5.2)$$


The calculus of variations provides the mathematics required to determine the path that minimizes the 
action integral. This variational approach is both elegant and beautiful, and has withstood the rigors of 
experimental confirmation. In fact, not only is it an exceedingly powerful alternative approach to the intuitive 
Newtonian approach in classical mechanics, but Hamilton’s variational principle now is recognized to be more 
fundamental than Newton’s Laws of Motion. The Lagrangian and Hamiltonian variational approaches to 
mechanics are the only approaches that can handle the Theory of Relativity, statistical mechanics, and the 
dichotomy of philosophical approaches to quantum physics. 
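
The statement that the physical path makes the action (5.2) stationary can be checked numerically for a simple case. The following is a minimal sketch, not part of the original text, that assumes a ball of mass m thrown vertically in uniform gravity so that L = ½mẏ² − mgy, with y(0) = y(T) = 0. The function names, the flight time of 2 s, and the trial variation ε sin(πt/T) are illustrative assumptions.

```python
import math

G = 9.81          # gravitational acceleration (m/s^2), assumed value
T_FLIGHT = 2.0    # total time of flight (s), illustrative choice


def action(path, t1, t2, m=1.0, steps=2000):
    """Approximate S = integral of (T - U) dt for a one-dimensional path y(t)."""
    dt = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        ta, tb = t1 + k * dt, t1 + (k + 1) * dt
        v = (path(tb) - path(ta)) / dt           # mean velocity over the step
        y_mid = 0.5 * (path(ta) + path(tb))      # mean height over the step
        total += (0.5 * m * v * v - m * G * y_mid) * dt
    return total


def true_path(t):
    """Newtonian solution with y(0) = y(T_FLIGHT) = 0."""
    return 0.5 * G * t * (T_FLIGHT - t)


def varied_path(eps):
    """True path plus a variation that vanishes at both end times."""
    return lambda t: true_path(t) + eps * math.sin(math.pi * t / T_FLIGHT)


s_true = action(true_path, 0.0, T_FLIGHT)
for eps in (-0.5, -0.1, 0.1, 0.5):
    # Each difference is positive: the Newtonian path has the smallest action.
    print(eps, action(varied_path(eps), 0.0, T_FLIGHT) - s_true)
```

For this Lagrangian the excess action grows as ε², which is the signature of a stationary point at ε = 0.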


5.2 Euler's differential equation


The calculus of variations, presented here, underlies the powerful variational approaches that were developed 
for classical mechanics. Variational calculus, developed for classical mechanics, now has become an essential 
approach to many other disciplines in science, engineering, economics, and medicine. 

For the special case of one dimension, the calculus of variations reduces to varying the function y(x) such
that the scalar functional F is an extremum, that is, it is a maximum or minimum, where


$$F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx \qquad (5.3)$$


Here x is the independent variable and y(x) is the dependent variable, with first derivative y' ≡ dy/dx. The
quantity f[y(x), y'(x); x] has some given dependence on y, y' and x. The calculus of variations involves varying
the function y(x) until a stationary value of F is found, which is presumed to be an extremum. This means that
if a function y = y(x) gives a minimum value for the scalar functional F, then any neighboring function, no
matter how close to y(x), must increase F. For all paths, the integral F is taken between two fixed points,
(x₁, y₁) and (x₂, y₂). Possible paths between the initial and final points are illustrated in figure 5.1. Relative to
any neighboring path, the functional F must have a stationary value which is presumed to be the correct
extremum path.

Define a neighboring function using a parametric representation y(ε, x), such that for ε = 0, y = y(0, x) =
y(x) is the function that yields the extremum for F. Assume that an infinitesimally small fraction ε of the
neighboring function η(x) is added to the extremum path y(x). That is, assume


$$y(\epsilon, x) = y(0, x) + \epsilon\,\eta(x) \qquad (5.4)$$
$$y'(\epsilon, x) \equiv \frac{dy(\epsilon, x)}{dx} = \frac{dy(0, x)}{dx} + \epsilon\frac{d\eta}{dx}$$


where it is assumed that the extremum function y(0, x) and the auxiliary function η(x) are well behaved
functions of x with continuous first derivatives, and where η(x) vanishes at x₁ and x₂, because, for all possible
paths, the function y(ε, x) must be identical with y(x) at the end points of the path, i.e. η(x₁) = η(x₂) = 0.
The situation is depicted in figure 5.1. For any such parametric family of curves, it is possible to express F as
a function of ε


$$F(\epsilon) = \int_{x_1}^{x_2} f[y(\epsilon, x), y'(\epsilon, x); x]\,dx \qquad (5.5)$$


The condition that the integral has a stationary (extremum) value is that F be independent of ε to first
order along the path. That is, the extremum value occurs for ε = 0 where


$$\left(\frac{\partial F}{\partial \epsilon}\right)_{\epsilon=0} = 0 \qquad (5.6)$$


for all functions η(x). This is illustrated on the right side of figure 5.1.
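Applying condition (5.6) to equation (5.5), and since x is independent of ε, then


$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon}\right)dx = 0 \qquad (5.7)$$


Since the limits of integration are fixed, the differential operation affects only the integrand. From equations
(5.4),


$$\frac{\partial y}{\partial \epsilon} = \eta(x) \qquad (5.8)$$

and

$$\frac{\partial y'}{\partial \epsilon} = \frac{d\eta}{dx} \qquad (5.9)$$


Consider the second term in the integrand


$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon}\,dx = \int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}\,dx \qquad (5.10)$$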


Figure 5.1: The left shows the extremum y(x) and neighboring paths y(ε, x) = y(x) + εη(x) between (x₁, y₁)
and (x₂, y₂) that minimizes the functional $F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx$. The right shows the dependence of F
as a function of the admixture coefficient ε for a maximum (upper) or a minimum (lower) at ε = 0.


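Integrate by parts


$$\int u\,dv = uv - \int v\,du \qquad (5.11)$$


gives


$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}\,dx = \left[\frac{\partial f}{\partial y'}\eta(x)\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)dx \qquad (5.12)$$


Note that the first term on the right-hand side is zero since by definition ∂y/∂ε = η(x) = 0 at x₁ and x₂. Thus


$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial \epsilon}\right)dx = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\eta(x) - \eta(x)\frac{d}{dx}\frac{\partial f}{\partial y'}\right)dx$$


Thus equation 5.7 reduces to


$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\eta(x)\,dx \qquad (5.13)$$

The function F will be an extremum if it is stationary at ε = 0. That is,

$$\left(\frac{\partial F}{\partial \epsilon}\right)_{\epsilon=0} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\eta(x)\,dx = 0 \qquad (5.14)$$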
This integral now appears to be independent of ε. However, the functions y and y' occurring in the derivatives
are functions of ε. Since (∂F/∂ε) at ε = 0 must vanish for a stationary value, and because η(x) is an arbitrary
function subject to the conditions stated, then the bracketed term in the above integrand must be zero. This
requirement leads to Euler's differential equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \qquad (5.15)$$

where y and y' are the original functions, independent of ε. The basis of the calculus of variations is that the
function y(x) that satisfies Euler's equation is a stationary function. Note that the stationary value could
be either a maximum or a minimum value. When Euler's equation is applied to mechanical systems using
the Lagrangian as the functional, then Euler's differential equation is called the Euler-Lagrange equation.


5.3 Applications of Euler's equation


5.1 Example: Shortest distance between two points 


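Consider a path that lies in the x − y plane. The infinitesimal length of arc is


$$ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx$$


Then the length of the arc is


$$F = \int_{1}^{2} ds = \int_{x_1}^{x_2}\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx$$


The function f is


$$f = \sqrt{1 + (y')^2}$$

Therefore

$$\frac{\partial f}{\partial y} = 0$$

and

$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}}$$


Inserting these into Euler's equation 5.15 gives


$$0 - \frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right) = 0$$

that is

$$\frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} = C$$


This is valid if

$$y' = \frac{C}{\sqrt{1 - C^2}} = a$$


Therefore

$$y = ax + b$$


[Figure: Shortest distance between two points in a plane.]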

which is the equation of a straight line in the plane. Thus the shortest path between two points in a plane is 
a straight line between these points, as is intuitively obvious. This stationary value obviously is a minimum. 

This trivial example of the use of Euler’s equation to determine an extremum value has given the obvious 
answer. It has been presented here because it provides a proof that a straight line is the shortest distance in 
a plane and illustrates the power of the calculus of variations to determine extremum paths. 
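
The dependence of the functional on the admixture coefficient ε, sketched on the right of figure 5.1, can be illustrated numerically for this example. The following is a minimal sketch under stated assumptions: the end points (0, 1) and (1, 3), the line y = 2x + 1 joining them, and the auxiliary function η(x) = sin(πx) are illustrative choices that do not come from the text.

```python
import math


def path_length(y, x1, x2, steps=4000):
    """Approximate the arc-length functional by summing short straight segments."""
    dx = (x2 - x1) / steps
    total = 0.0
    for k in range(steps):
        xa, xb = x1 + k * dx, x1 + (k + 1) * dx
        total += math.hypot(dx, y(xb) - y(xa))
    return total


def varied(eps):
    """Straight line through (0, 1) and (1, 3) plus eps * eta(x), with eta(0) = eta(1) = 0."""
    return lambda x: 2.0 * x + 1.0 + eps * math.sin(math.pi * x)


lengths = {eps: path_length(varied(eps), 0.0, 1.0)
           for eps in (-0.2, -0.1, 0.0, 0.1, 0.2)}
for eps, length in lengths.items():
    # The smallest length occurs at eps = 0, the straight line.
    print(eps, length)
```

The length at ε = 0 equals the straight-line distance √5 between the end points, and it increases for either sign of ε, as expected for a minimum.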


5.2 Example: Brachistochrone problem 


The Brachistochrone problem involves finding the path having the minimum transit time between two
points. The Brachistochrone problem stimulated the development of the calculus of variations by John
Bernoulli and Euler. For simplicity, take the case of frictionless motion in the x − y plane with a uniform
gravitational field acting in the y direction, as shown in the adjacent figure. The question is what
constrained path will result in the minimum transit time between two points (x₁, y₁) and (x₂, y₂).

Consider that the particle of mass m starts at the origin x₁ = 0, y₁ = 0 with zero velocity. Since the
problem conserves energy and assuming that initially E = KE + PE = 0 then


$$\frac{1}{2}mv^2 - mgy = 0$$

That is


$$v = \sqrt{2gy}$$


The transit time is given by


$$t = \int_{1}^{2}\frac{ds}{v} = \int_{1}^{2}\frac{\sqrt{dx^2 + dy^2}}{\sqrt{2gy}} = \int_{y_1}^{y_2}\sqrt{\frac{1 + (x')^2}{2gy}}\,dy$$

where x' ≡ dx/dy. Note that, in this example, the independent variable has been chosen to be y and the dependent
variable is x(y).
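The function f of the integral is


$$f = \frac{1}{\sqrt{2g}}\sqrt{\frac{1 + (x')^2}{y}}$$


Factor out the constant $\sqrt{2g}$ term, which does not affect the final equation, and note that


$$\frac{\partial f}{\partial x} = 0$$
$$\frac{\partial f}{\partial x'} = \frac{x'}{\sqrt{y\left(1 + (x')^2\right)}}$$

Therefore Euler's equation gives

$$0 - \frac{d}{dy}\left(\frac{x'}{\sqrt{y\left(1 + (x')^2\right)}}\right) = 0$$

or

$$\frac{x'}{\sqrt{y\left(1 + (x')^2\right)}} = \text{constant} = \frac{1}{\sqrt{2a}}$$


That is

$$\frac{(x')^2}{y\left(1 + (x')^2\right)} = \frac{1}{2a}$$

This may be rewritten as

$$x = \int_{y_1}^{y_2}\frac{y\,dy}{\sqrt{2ay - y^2}}$$

Changing the variable to y = a(1 − cos θ) gives dy = a sin θ dθ, leading to the integral


$$x = \int a(1 - \cos\theta)\,d\theta$$


or

$$x = a(\theta - \sin\theta) + \text{constant}$$

[Figure: The Brachistochrone problem involves finding the path for the minimum transit time for constrained
frictionless motion in a uniform gravitational field. The path from (x₁, y₁) to (x₂, y₂) is a cycloid.]
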
The parametric equations for a cycloid passing through the origin are


$$x = a(\theta - \sin\theta)$$
$$y = a(1 - \cos\theta)$$


which is the form of the solution found. That is, the shortest time between two points is obtained by
constraining the motion of the mass to follow a cycloid shape. Thus the mass first accelerates rapidly by falling
down steeply and then follows the curve and coasts upward at the end. The elapsed time is obtained by
inserting the above parametric relations for x and y, in terms of θ, into the transit time integral giving
$t = \sqrt{a/g}\,\theta$ where a and θ are fixed by the end point coordinates. Thus the time to fall from starting with zero
velocity at the cusp to the minimum of the cycloid is $\pi\sqrt{a/g}$. If y₂ = y₁ = 0 then x₂ = 2πa which defines the
shape of the cycloid and the minimum time is $2\pi\sqrt{a/g} = \sqrt{2\pi x_2/g}$. If the mass starts with a non-zero initial
velocity, then the starting point is not at the cusp of the cycloid, but down a distance d such that the kinetic
energy equals the potential energy difference from the cusp.

A modern application of the Brachistochrone problem is determination of the optimum shape of the low- 
friction emergency chute that passengers slide down to evacuate a burning aircraft. Bernoulli solved the 
problem of rapid evacuation of an aircraft two centuries before the first flight of a powered aircraft. 
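
The transit time along the cycloid can be compared numerically with that along a straight frictionless ramp joining the same two points. The following is a minimal sketch, assuming a = 1 m, g = 9.81 m/s², a start from rest at the cusp, and an end point at the lowest point of the cycloid (πa, 2a) with y measured downward. The function names are illustrative.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2), assumed value


def cycloid_time(a, theta_end, steps=10000):
    """Midpoint-rule integral of ds / v along x = a(th - sin th), y = a(1 - cos th)."""
    dtheta = theta_end / steps
    total = 0.0
    for k in range(steps):
        th = (k + 0.5) * dtheta                       # midpoint avoids th = 0, where v = 0
        ds = a * math.sqrt((1 - math.cos(th)) ** 2 + math.sin(th) ** 2) * dtheta
        v = math.sqrt(2 * G * a * (1 - math.cos(th)))  # v = sqrt(2 g y)
        total += ds / v
    return total


def straight_line_time(x2, y2):
    """Time to slide from rest at the origin down a straight ramp to (x2, y2)."""
    length = math.hypot(x2, y2)
    # Constant acceleration g * (y2 / length) along a ramp of the given length.
    return math.sqrt(2 * length ** 2 / (G * y2))


a = 1.0
t_cycloid = cycloid_time(a, math.pi)
t_line = straight_line_time(math.pi * a, 2 * a)
print(t_cycloid, math.pi * math.sqrt(a / G), t_line)
```

The numerical cycloid time agrees with π√(a/g), and it is shorter than the straight-ramp time even though the cycloid is the longer path.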


5.3 Example: Minimal travel cost 


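Assume that the cost of flying an aircraft at height z is $e^{-\kappa z}$ per unit distance of flight-path, where κ is a
positive constant. Consider that the aircraft flies in the (x, z)-plane from the point (−a, 0) to the point (a, 0)
where z = 0 corresponds to ground level, and where the z-axis points vertically upwards. Find the extremal
for the problem of minimizing the total cost of the journey.

The differential arc-length element of the flight path ds can be written as


$$ds = \sqrt{dx^2 + dz^2} = \sqrt{1 + z'^2}\,dx$$


where z' ≡ dz/dx. Thus the cost integral to be minimized is

$$C = \int_{-a}^{+a} e^{-\kappa z}\,ds = \int_{-a}^{+a} e^{-\kappa z}\sqrt{1 + z'^2}\,dx$$


The function of this integral is

$$f = e^{-\kappa z}\sqrt{1 + z'^2}$$

The partial differentials required for the Euler equations are

$$\frac{d}{dx}\frac{\partial f}{\partial z'} = \frac{z'' e^{-\kappa z}}{\sqrt{1 + z'^2}} - \frac{\kappa z'^2 e^{-\kappa z}}{\sqrt{1 + z'^2}} - \frac{z'^2 z'' e^{-\kappa z}}{(1 + z'^2)^{3/2}}$$

$$\frac{\partial f}{\partial z} = -\kappa e^{-\kappa z}\sqrt{1 + z'^2}$$


Therefore Euler's equation equals


$$\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} = -\kappa e^{-\kappa z}\sqrt{1 + z'^2} - \frac{z'' e^{-\kappa z}}{\sqrt{1 + z'^2}} + \frac{\kappa z'^2 e^{-\kappa z}}{\sqrt{1 + z'^2}} + \frac{z'^2 z'' e^{-\kappa z}}{(1 + z'^2)^{3/2}} = 0$$

This can be simplified by multiplying by $-e^{\kappa z}(1 + z'^2)^{3/2}$ to give

$$\kappa(1 + z'^2)^2 + z''(1 + z'^2) - \kappa z'^2(1 + z'^2) - z'^2 z'' = 0$$


Cancelling terms gives

$$z'' + \kappa(1 + z'^2) = 0$$

Separating the variables leads to


$$\arctan z' = \int\frac{dz'}{1 + z'^2} = -\int\kappa\,dx = -\kappa x + c_1$$

Integration gives


$$z(x) = \int_{-a}^{x} z'\,dx + c_2 = \int_{-a}^{x}\tan(c_1 - \kappa x)\,dx + c_2 = \frac{\ln(\cos(c_1 - \kappa x)) - \ln(\cos(c_1 + \kappa a))}{\kappa} + c_2 = \frac{1}{\kappa}\ln\left(\frac{\cos(c_1 - \kappa x)}{\cos(c_1 + \kappa a)}\right) + c_2$$


Using the initial condition that z(−a) = 0 gives c₂ = 0. Similarly the final condition z(a) = 0 implies that
c₁ = 0. Thus Euler's equation has determined that the optimal trajectory that minimizes the cost integral C is


$$z(x) = \frac{1}{\kappa}\ln\left(\frac{\cos(\kappa x)}{\cos(\kappa a)}\right)$$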


This example is typical of problems encountered in economics. 
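
The extremal trajectory can be checked numerically by evaluating the cost integral along it and comparing with the cost of flying at ground level. The following is a minimal sketch, assuming the illustrative values κ = 1 and a = 1 (so that κa < π/2 and the logarithm is defined). The helper names are not from the text.

```python
import math


def cost(z, dz, a, kappa, steps=20000):
    """Midpoint-rule estimate of C = integral of exp(-kappa z) sqrt(1 + z'^2) dx on [-a, a]."""
    dx = 2 * a / steps
    total = 0.0
    for k in range(steps):
        x = -a + (k + 0.5) * dx
        total += math.exp(-kappa * z(x)) * math.sqrt(1 + dz(x) ** 2) * dx
    return total


kappa, a = 1.0, 1.0   # illustrative values with kappa * a < pi / 2


def extremal(x):
    """Trajectory z(x) = ln(cos(kappa x) / cos(kappa a)) / kappa found above."""
    return math.log(math.cos(kappa * x) / math.cos(kappa * a)) / kappa


def extremal_slope(x):
    """Slope z'(x) = -tan(kappa x) of the extremal trajectory."""
    return -math.tan(kappa * x)


cost_extremal = cost(extremal, extremal_slope, a, kappa)
cost_ground = cost(lambda x: 0.0, lambda x: 0.0, a, kappa)
print(cost_extremal, 2 * math.sin(kappa * a) / kappa, cost_ground)
```

Along the extremal the integrand reduces to cos(κa)/cos²(κx), so the cost integrates in closed form to 2 sin(κa)/κ, which is less than the ground-level cost 2a.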


5.4 Selection of the independent variable 


A wide selection of variables can be chosen as the independent variable for variational calculus. The derivation
of Euler's equation and examples 5.1 and 5.3 assumed that the independent variable is x, whereas example
5.2 used y as the independent variable, and Lagrangian mechanics uses time t as the
independent variable. Selection of which variable to use as the independent variable does not change the
physics of a problem, but some selections can simplify the mathematics for obtaining an analytic solution.
The following example of a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that
stretches between two circular hoops illustrates the importance of selecting the independent variable.


5.4 Example: Surface area of a cylindrically-symmetric soap bubble 


Consider a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that stretches
between two circular hoops. The surface energy, that results from the surface tension of the soap bubble, is
minimized when the surface area of the bubble is minimized. Assume that the axes of the two hoops lie along
the z axis as shown in the adjacent figure. It is intuitively obvious that the soap bubble having the minimum
surface area that is bounded by the two hoops will have a circular cross section that is concentric with the
symmetry axis, and the radius will be smaller between the two hoops. Therefore, intuition can be used to
simplify the problem to finding the shape of the contour of revolution around the axis of symmetry that
defines the shape of the surface of minimum surface area. Use cylindrical coordinates (ρ, θ, z) and assume
that hoop 1 at z₁ has radius ρ₁ and hoop 2 at z₂ has radius ρ₂. Consider the cases where either ρ or z is
selected to be the independent variable.

[Figure: Cylindrically-symmetric surface formed by rotation about the z axis of a soap bubble suspended
between two identical hoops centred on the z axis.]

The differential arc-length element of the circular annulus at constant θ between z and z + dz is given by
$ds = \sqrt{dz^2 + d\rho^2}$. Therefore the area of the infinitesimal circular annulus is dS = 2πρ ds which can be
integrated to give the area of the surface S of the soap bubble bounded by the two circular hoops as


$$S = 2\pi\int_{1}^{2}\rho\sqrt{dz^2 + d\rho^2}$$

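Independent variable z


Assuming that z is the independent variable, then the surface area can be written as


$$S = 2\pi\int_{z_1}^{z_2}\rho\sqrt{1 + \left(\frac{d\rho}{dz}\right)^2}\,dz = 2\pi\int_{z_1}^{z_2}\rho\sqrt{1 + \rho'^2}\,dz$$

where ρ' ≡ dρ/dz. The function of the surface integral is $f = \rho\sqrt{1 + \rho'^2}$. The derivatives are


$$\frac{\partial f}{\partial \rho} = \sqrt{1 + \rho'^2}$$

and

$$\frac{\partial f}{\partial \rho'} = \frac{\rho\rho'}{\sqrt{1 + \rho'^2}}$$


Therefore Euler's equation gives


$$\sqrt{1 + \rho'^2} - \frac{d}{dz}\left(\frac{\rho\rho'}{\sqrt{1 + \rho'^2}}\right) = 0$$


This is not an easy equation to solve.


Independent variable ρ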

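Consider the case where the independent variable is chosen to be ρ, then the surface integral can be written
as


$$S = 2\pi\int_{\rho_1}^{\rho_2}\rho\sqrt{1 + \left(\frac{dz}{d\rho}\right)^2}\,d\rho = 2\pi\int_{\rho_1}^{\rho_2}\rho\sqrt{1 + z'^2}\,d\rho$$


where z' ≡ dz/dρ. Thus the function of the surface integral is $f = \rho\sqrt{1 + z'^2}$. The derivatives are


$$\frac{\partial f}{\partial z} = 0$$

and

$$\frac{\partial f}{\partial z'} = \frac{\rho z'}{\sqrt{1 + z'^2}}$$


Therefore Euler's equation gives


$$0 - \frac{d}{d\rho}\left(\frac{\rho z'}{\sqrt{1 + z'^2}}\right) = 0$$

That is

$$\frac{\rho z'}{\sqrt{1 + z'^2}} = a$$

or

$$z' = \frac{a}{\sqrt{\rho^2 - a^2}}$$


The integral of this is

$$z = a\cosh^{-1}\left(\frac{\rho}{a}\right) + b$$

That is

$$\rho = a\cosh\left(\frac{z - b}{a}\right)$$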

which is the equation of a catenary. The catenary is the shape of a uniform flexible cable hung in a uniform 
gravitational field. The constants a and b are given by the end points. The physics of the solution must be 
identical for either choice of independent variable. However, mathematically one case is easier to solve than 
the other because, in the latter case, one term in Euler’s equation is zero. 
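
The constants of the catenary solution can be found numerically for given hoops. The following is a minimal sketch, assuming two identical hoops of radius R = 1 at z = ±h with h = 0.5, so that b = 0 by symmetry and a satisfies R = a cosh(h/a). The bracket [0.5, 1.0] is chosen by hand to select the larger of the two roots, and the function names are illustrative.

```python
import math


def catenary_parameter(radius, half_gap, lo, hi, tol=1e-12):
    """Bisection solution of radius = a * cosh(half_gap / a) for a in [lo, hi]."""
    f = lambda a: a * math.cosh(half_gap / a) - radius
    if f(lo) * f(hi) > 0:
        raise ValueError("bracket does not contain a root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def catenoid_area(a, half_gap):
    """Closed form of 2 pi * integral of rho sqrt(1 + rho'^2) dz for rho = a cosh(z / a)."""
    return math.pi * a * (2 * half_gap + a * math.sinh(2 * half_gap / a))


R, h = 1.0, 0.5                       # hoop radius and half separation (assumed values)
a = catenary_parameter(R, h, lo=0.5, hi=1.0)
area_catenoid = catenoid_area(a, h)
area_cylinder = 2 * math.pi * R * 2 * h
print(a, area_catenoid, area_cylinder)
```

For these hoops the catenary surface has a smaller area than the straight cylinder joining the hoops, consistent with the intuition that the radius is smaller between the two hoops.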


5.5 Functions with several independent variables y_i(x)

The discussion has focussed on systems having only a single function y(x) such that the functional is an
extremum. It is more common to have a functional that is dependent upon several independent variables
f[y₁(x), y₁'(x), y₂(x), y₂'(x), ...; x] which can be written as


$$F = \int_{x_1}^{x_2}\sum_{i=1}^{N} f[y_i(x), y_i'(x); x]\,dx \qquad (5.16)$$


where i = 1, 2, 3, ..., N.
By analogy with the one dimensional problem, define neighboring functions η_i for each variable. Then


$$y_i(\epsilon, x) = y_i(0, x) + \epsilon\,\eta_i(x) \qquad (5.17)$$
$$y_i'(\epsilon, x) \equiv \frac{dy_i(\epsilon, x)}{dx} = \frac{dy_i(0, x)}{dx} + \epsilon\frac{d\eta_i}{dx}$$


where η_i are independent functions of x that vanish at x₁ and x₂. Using equations 5.12 and 5.17 leads to
the requirements for an extremum value to be


$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2}\sum_{i=1}^{N}\left(\frac{\partial f}{\partial y_i}\frac{\partial y_i}{\partial \epsilon} + \frac{\partial f}{\partial y_i'}\frac{\partial y_i'}{\partial \epsilon}\right)dx = \int_{x_1}^{x_2}\sum_{i=1}^{N}\left(\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'}\right)\eta_i(x)\,dx = 0 \qquad (5.18)$$


If the variables y_i(x) are independent, then the η_i(x) are independent. Since the η_i(x) are independent,
then evaluating the above equation at ε = 0 implies that each term in the bracket must vanish independently.
That is, Euler's differential equation becomes a set of N equations for the N independent variables


$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \qquad (5.19)$$


where i = 1, 2, 3, ..., N. Thus, each of the N equations can be solved independently when the N variables are
independent. Note that Euler's equation involves partial derivatives for the dependent variables y_i, y_i' and
the total derivative for the independent variable x.


5.5 Example: Fermat’s Principle 


In 1662 Fermat proposed that the propagation of light obeyed the generalized principle of least transit time.
In optics, Fermat's principle, or the principle of least time, is the principle that the path taken between two
points by a ray of light is the path that can be traversed in the least time. Historically, the proof of Fermat's
principle by Johann Bernoulli was one of the first triumphs of the calculus of variations, and served as a
guiding principle in the formulation of physical laws using variational calculus.

Consider the geometry shown in the figure, where the light travels from the point P₁(0, y₁, 0) to the point
P₂(x₂, −y₂, 0). The light beam intersects a plane glass interface at the point Q(x, 0, z).

[Figure: Light incident upon a plane glass interface in the (x, z) plane at y = 0.]

The French mathematician Fermat discovered that the required path travelled by light is the path for which
the travel time t is a minimum. That is, the transit time from the initial point P₁ to the final point P₂ is
given by

$$t = \int_{1}^{2} dt = \int_{1}^{2}\frac{ds}{v} = \frac{1}{c}\int_{1}^{2} n\,ds = \frac{1}{c}\int_{1}^{2} n(x, y, z)\sqrt{1 + (x')^2 + (z')^2}\,dy$$


assuming that the velocity of light in any medium is given by v = c/n where n is the refractive index of the
medium and c is the velocity of light in vacuum.

This is a problem that has two dependent variables x(y) and z(y) with y chosen as the independent
variable. The integral can be broken into two parts y₁ → 0 and 0 → −y₂.


$$t = \frac{1}{c}\left[\int_{y_1}^{0} n_1\sqrt{1 + (x')^2 + (z')^2}\,dy + \int_{0}^{-y_2} n_2\sqrt{1 + (x')^2 + (z')^2}\,dy\right]$$


The functionals are functions of x' and z' but not x or z. Thus Euler's equation for z simplifies to


$$0 - \frac{d}{dy}\left[\frac{1}{c}\left(\frac{n_1 z'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 z'}{\sqrt{1 + x'^2 + z'^2}}\right)\right] = 0$$

This implies that z' = 0, therefore z is a constant. Since the initial and final values were chosen to be
z₁ = z₂ = 0, therefore at the interface z = 0. Similarly Euler's equations for x are


$$0 - \frac{d}{dy}\left[\frac{1}{c}\left(\frac{n_1 x'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 x'}{\sqrt{1 + x'^2 + z'^2}}\right)\right] = 0$$


But x' = tan θ₁ for n₁ and x' = −tan θ₂ for n₂ and it was shown that z' = 0. Thus


$$\frac{d}{dy}\left[\frac{1}{c}\left(\frac{n_1\tan\theta_1}{\sqrt{1 + \tan^2\theta_1}} - \frac{n_2\tan\theta_2}{\sqrt{1 + \tan^2\theta_2}}\right)\right] = \frac{d}{dy}\left[\frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right)\right] = 0$$


Therefore (1/c)(n₁ sin θ₁ − n₂ sin θ₂) = constant which must be zero since when n₁ = n₂, then θ₁ = θ₂. Thus
Fermat's principle leads to Snell's Law.


$$n_1\sin\theta_1 = n_2\sin\theta_2$$


The geometry of this problem is simple enough to directly minimize the path rather than using Euler's
equations for the two parameters as performed above. The lengths of the paths P₁Q and QP₂ are


$$P_1Q = \sqrt{x^2 + y_1^2 + z^2}$$
$$QP_2 = \sqrt{(x_2 - x)^2 + y_2^2 + z^2}$$


The total transit time is given by


$$t = \frac{1}{c}\left(n_1\sqrt{x^2 + y_1^2 + z^2} + n_2\sqrt{(x_2 - x)^2 + y_2^2 + z^2}\right)$$


This problem involves two free parameters, the coordinates x and z of the point Q. To find the minimum, set
the partial derivatives ∂t/∂z = 0 and ∂t/∂x = 0. That is,


$$\frac{\partial t}{\partial z} = \frac{1}{c}\left(\frac{n_1 z}{\sqrt{x^2 + y_1^2 + z^2}} + \frac{n_2 z}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}}\right) = 0$$


This is zero only if z = 0, that is, the point Q lies in the plane containing P₁ and P₂. Similarly


$$\frac{\partial t}{\partial x} = \frac{1}{c}\left(\frac{n_1 x}{\sqrt{x^2 + y_1^2 + z^2}} - \frac{n_2(x_2 - x)}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}}\right) = \frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right) = 0$$


This is zero only if Snell's law applies, that is


$$n_1\sin\theta_1 = n_2\sin\theta_2$$


Fermat's principle has shown that the refracted light is given by Snell's Law, and is in a plane normal to the
surface. The laws of reflection also are given since then n₁ = n₂ = n and the angle of reflection equals the
angle of incidence.
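
The direct minimization of the transit time can also be carried out numerically. The following is a minimal sketch, assuming z = 0, illustrative refractive indices n₁ = 1.0 and n₂ = 1.5, end points P₁(0, 1) and P₂(2, −1), and c = 1 in arbitrary units. A golden-section search is used here as one convenient way to locate the minimum; it is not a method taken from the text.

```python
import math


def transit_time(x, x2, y1, y2, n1, n2, c=1.0):
    """Travel time for P1(0, y1) -> Q(x, 0) -> P2(x2, -y2) with z = 0."""
    return (n1 * math.hypot(x, y1) + n2 * math.hypot(x2 - x, y2)) / c


def minimise(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        left = hi - inv_phi * (hi - lo)
        right = lo + inv_phi * (hi - lo)
        if f(left) < f(right):
            hi = right
        else:
            lo = left
    return 0.5 * (lo + hi)


n1, n2 = 1.0, 1.5                 # refractive indices (assumed values)
x2, y1, y2 = 2.0, 1.0, 1.0        # end-point coordinates (assumed values)

x_q = minimise(lambda x: transit_time(x, x2, y1, y2, n1, n2), 0.0, x2)
sin1 = x_q / math.hypot(x_q, y1)               # sine of the angle of incidence
sin2 = (x2 - x_q) / math.hypot(x2 - x_q, y2)   # sine of the angle of refraction
print(x_q, n1 * sin1, n2 * sin2)
```

At the crossing point that minimizes the transit time, the two printed products n₁ sin θ₁ and n₂ sin θ₂ agree to within the accuracy of the search, which is Snell's Law.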
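

5.6 Example: Minimum of (∇φ)² in a volume


Find the function φ(x₁, x₂, x₃) that has the minimum value of (∇φ)² per unit volume. For the volume
V it is desired to minimize the following


$$J = \frac{1}{V}\iiint(\nabla\phi)^2\,dx_1dx_2dx_3 = \frac{1}{V}\iiint\left[\left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2\right]dx_1dx_2dx_3$$


Note that the variables x₁, x₂, x₃ are independent, and thus Euler's equation for several independent variables
can be used. To minimize the functional J, the function


$$f = \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2$$


must satisfy the Euler equation


$$\frac{\partial f}{\partial\phi} - \sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial\phi_i'}\right) = 0$$


where φ_i' = ∂φ/∂x_i. Substituting f into Euler's equation gives


$$\sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial\phi}{\partial x_i}\right) = 0$$


This is just Laplace's equation


$$\nabla^2\phi = 0$$


Therefore φ must satisfy Laplace's equation in order that the functional J be a minimum.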


5.6 Euler’s integral equation 


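An integral form of the Euler differential equation can be written which is useful for cases when the function
f does not depend explicitly on the independent variable x, that is, when ∂f/∂x = 0. Note that


$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} + \frac{\partial f}{\partial y'}\frac{dy'}{dx} \qquad (5.20)$$

and

$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{dy'}{dx}\frac{\partial f}{\partial y'} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \qquad (5.21)$$


Combining these two equations gives


$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{df}{dx} - \frac{\partial f}{\partial x} - y'\frac{\partial f}{\partial y} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \qquad (5.22)$$


The last two terms can be rewritten as


$$y'\left(\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y}\right) \qquad (5.23)$$

which vanishes when the Euler equation is satisfied. Therefore the above equation simplifies to

$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = 0 \qquad (5.24)$$


This integral form of Euler's equation is especially useful when ∂f/∂x = 0, that is, when f does not depend
explicitly on the independent variable x. Then the first integral of equation 5.24 is a constant, i.e.


$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \qquad (5.25)$$

This is Euler's integral variational equation. Note that the shortest distance between two points, the
minimum surface of rotation, and the brachistochrone, described earlier, all are examples where ∂f/∂x = 0 and
thus the integral form of Euler's equation is useful for solving these cases.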
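
As a brief worked illustration, added here as a sketch rather than taken from the original text, the integral form (5.25) can be applied to the shortest-distance problem of example 5.1, for which $f = \sqrt{1 + (y')^2}$ has no explicit dependence on x:

```latex
\begin{align*}
f - y'\frac{\partial f}{\partial y'}
  &= \sqrt{1 + (y')^2} - \frac{(y')^2}{\sqrt{1 + (y')^2}} \\
  &= \frac{1}{\sqrt{1 + (y')^2}} = \text{constant}
\end{align*}
```

Hence y' is a constant, which again gives the straight line y = ax + b, obtained here from a first-order equation without differentiating ∂f/∂y' with respect to x.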




5.7 Constrained variational systems 
Imposing a constraint on a variational system implies: 


1. The N constrained coordinates y;(x) are correlated which violates 
the assumption made in chapter 5.5 that the N variables are inde- 
pendent. 


2. Constrained motion implies that constraint forces must be acting 
to account for the correlation of the variables. These constraint 
forces must be taken into account in the equations of motion. 


For example, for a disk rolling down an inclined plane without slipping, there are three coordinates: x
(perpendicular to the wedge), y (along the surface of the wedge), and the rotation angle θ shown in figure 5.2.
The constraint forces, friction and the normal force N, lead to the correlation of the variables such
that x = R, while y = Rθ. Basically there is only one independent
variable, which can be either y or θ. The use of only one independent
variable essentially sweeps the constraint forces under the rug, which is
fine if you only need to know the equation of motion. If you need to determine the forces of constraint then
it is necessary to include all coordinates explicitly in the equations of motion as discussed below.


Figure 5.2: A disk rolling down 
an inclined plane. 


5.7.1 Holonomic constraints 


Most systems involve restrictions or constraints that couple the coordinates. For example, the y_i(x) may
be confined to a surface in coordinate space. The constraints mean that the coordinates y_i(x) are not
independent, but are related by equations of constraint. A constraint is called holonomic if the equations of
constraint can be expressed in the form of an algebraic equation that directly and unambiguously specifies 
the shape of the surface of constraint. A non-holonomic constraint does not provide an algebraic relation 
between the correlated coordinates. In addition to the holonomy of the constraints, the equations of con- 
straint also can be grouped into the following three classifications depending on whether they are algebraic, 
differential, or integral. These three classifications for the constraints exhibit different holonomy relating the 
coupled coordinates. Fortunately the solution of constrained systems is greatly simplified if the equations of 
constraint are holonomic. 


5.7.2 Geometric (algebraic) equations of constraint 


Geometric constraints can be expressed in the form of algebraic relations that directly specify the shape of
the surface of constraint in coordinate space q₁, q₂, ..q_j, ..q_n.


$$g_k(q_1, q_2, \ldots q_j, \ldots q_n; t) = 0 \qquad (5.26)$$


where j = 1,2,3,...n. There can be m such equations of constraint where 0 ≤ k ≤ m. An example of such a
geometric constraint is when the motion is confined to the surface of a sphere of radius R in coordinate space
which can be written in the form g = x² + y² + z² − R² = 0. Such algebraic constraint equations are called
holonomic, which allows use of generalized coordinates as well as Lagrange multipliers to handle both the
constraint forces and the correlation of the coordinates.


5.7.3 Kinematic (differential) equations of constraint 


The m constraint equations also can be expressed in terms of the infinitesimal displacements of the form


$$\sum_{j=1}^{n}\frac{\partial g_k}{\partial q_j}dq_j + \frac{\partial g_k}{\partial t}dt = 0 \qquad (5.27)$$


where k = 1,2,3,...m, j = 1,2,3,...n. If equation (5.27) represents the total differential of a function then
it can be integrated to give a holonomic relation of the form of equation 5.26. However, if equation 5.27 is
not the total differential, then it is non-holonomic and can be integrated only after having solved the full
problem.


An example of differential constraint equations is a wheel rolling on a plane without slipping, which is
non-holonomic and more complicated than might be expected. The wheel moving on a plane has five degrees
of freedom since the height $z$ is fixed. That is, the motion of the center of mass requires two coordinates
$(x, y)$ plus there are three angles $(\phi, \theta, \psi)$, where $\phi$ is the rotation angle for the wheel, $\theta$ is the pivot angle of
the axis, and $\psi$ is the tilt angle of the wheel. If the wheel slides then all five degrees of freedom are active.
If the axis of rotation of the wheel is horizontal, that is, the tilt angle $\psi = 0$ is constant, then this kinematic
system leads to three differential constraint equations. The wheel can roll with angular velocity $\dot{\phi}$, as well as
pivot, which corresponds to a change in $\theta$. Combining these leads to two differential equations of constraint


$$dx - a \sin\theta\, d\phi = 0 \qquad\qquad dy + a \cos\theta\, d\phi = 0 \tag{5.28}$$


where $a$ is the radius of the wheel. These constraints are insufficient to provide finite relations between all the coordinates. That is, the
constraints cannot be reduced by integration to the form of equation 5.26 because there is no functional relation
between $\phi$ and the other three variables, $x, y, \theta$. Many rolling trajectories are possible between any two points
of contact on the plane that are related to different pivot angles. That is, the point of contact of the disk
could pivot plus roll in a circle, returning to the same point where $x, y, \theta$ are unchanged, whereas the value
of $\phi$ depends on the circumference of the circle. As a consequence the rolling constraint is non-holonomic
except for the case where the disk rolls in a straight line and remains vertical.
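
The path dependence of $\phi$ can be checked numerically. The following is a minimal sketch, not part of the original text, that integrates the two constraints of equation 5.28 while the disk pivots through one full turn on a closed circle. The function name `roll_around_circle`, the loop radius `rho`, and the step count are illustrative assumptions.

```python
import math

def roll_around_circle(rho, a, steps=20000):
    """Integrate dx = a sin(theta) dphi and dy = -a cos(theta) dphi (eq. 5.28)
    while the contact point traces one closed circle of radius rho.

    The pivot angle theta advances uniformly by 2*pi, and the rolling angle
    phi advances by (circumference / wheel radius) = 2*pi*rho/a.
    """
    x = y = 0.0
    dphi = (2.0 * math.pi * rho / a) / steps   # rolling angle per step
    dtheta = 2.0 * math.pi / steps             # pivot angle per step
    for k in range(steps):
        theta_mid = (k + 0.5) * dtheta         # midpoint rule for the integral
        x += a * math.sin(theta_mid) * dphi
        y -= a * math.cos(theta_mid) * dphi
    phi_total = steps * dphi
    return x, y, phi_total

# Two loops of different size, same wheel radius (assumed value a = 0.5)
for rho in (1.0, 2.0):
    x, y, phi = roll_around_circle(rho, a=0.5)
    print(f"rho={rho}: x={x:.2e}, y={y:.2e}, phi={phi:.4f}")
```

Both loops return the contact point to its starting $(x, y)$ with $\theta$ advanced by a full turn, yet the accumulated $\phi$ differs, so no function $\phi(x, y, \theta)$ can exist.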


5.7.4 Isoperimetric (integral) equations of constraint 


Equations of constraint also can be expressed in terms of direct integrals. This situation is encountered for
isoperimetric problems, such as finding the maximum volume bounded by a surface of fixed area, or the
shape of a hanging rope of fixed length. Integral constraints occur in economics when minimizing some cost
function subject to a fixed total cost constraint.

A simple example of an isoperimetric problem involves finding the curve $y = y(x)$, satisfying the boundary
conditions $y(x_1) = a$ and $y(x_2) = b$, such that the functional


$$F(y) = \int_{x_1}^{x_2} f(y, y'; x)\, dx \tag{5.29}$$


is an extremum while the perimeter also is constrained to satisfy


$$G(y) = \int_{x_1}^{x_2} g(y, y'; x)\, dx = l \tag{5.30}$$

where $l$ is a fixed length. This integral constraint is geometric and holonomic. Another example is finding
the minimum surface area of a closed surface subject to the enclosed volume being the constraint.


5.7.5 Properties of the constraint equations 


Holonomic constraints: Geometric constraints can be expressed in the form of an algebraic equation
that directly specifies the shape of the surface of constraint


$$g(y_1, y_2, y_3, \dots; t) = 0 \tag{5.31}$$


Such a system is called holonomic since there is a direct relation between the coupled variables. An example
of such a holonomic geometric constraint is if the motion is confined to the surface of a sphere of radius $R$,
which can be written in the form

$$g = x^2 + y^2 + z^2 - R^2 = 0 \tag{5.32}$$


Non-holonomic constraints: Constraints that cannot be expressed in the form of equation 5.31 are
non-holonomic, and there are many classes of them. The algebraic approach is difficult to handle when the constraint is an
inequality, such as the requirement that the location is restricted to lie inside a spherical shell of radius $R$,
which can be expressed as

$$g = x^2 + y^2 + z^2 - R^2 \le 0 \tag{5.33}$$

This non-holonomic constrained system has a one-sided constraint. Systems usually are non-holonomic if
the constraint is kinematic as discussed above.


Partial holonomic constraints: Partial-holonomic constraints are holonomic for a restricted range
of the constraint surface in coordinate space, and this range can be case specific. This can occur if the
constraint force is one-sided and perpendicular to the path. An example is the pendulum with the mass
attached to the fulcrum by a flexible string that provides tension but not compression. Then the pendulum
length is constant only if the tension in the string is positive. Thus the pendulum will be holonomic if
the gravitational plus centrifugal forces are such that the tension in the string is positive, but the system
becomes non-holonomic if the tension would have to be negative, as can happen when the pendulum rotates to an upright
angle where the outward centrifugal force is insufficient to compensate for the vertical downward component
of the gravitational force. There are many other examples where the motion of an object is holonomic when
the object is pressed against the constraint surface, such as the surface of the Earth, but is unconstrained if
the object leaves the surface.
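
The string pendulum can be made quantitative with a short sketch. It is an illustration under stated assumptions, not part of the original text: the bob is launched from the bottom with speed `v0`, the angle `theta` is measured from the downward vertical, and the tension follows from energy conservation combined with the radial equation of motion. The function names and default values are hypothetical.

```python
import math

def string_tension(theta, v0, length, mass=1.0, g=9.81):
    """Tension required to keep the bob on the circle at angle theta (radians,
    measured from the downward vertical) after launch from the bottom at speed v0.

    Only meaningful for angles the bob can actually reach (v_sq >= 0).
    """
    # Energy conservation: v^2 = v0^2 - 2 g L (1 - cos(theta))
    v_sq = v0**2 - 2.0 * g * length * (1.0 - math.cos(theta))
    # Radial equation of motion: T - m g cos(theta) = m v^2 / L
    return mass * (v_sq / length + g * math.cos(theta))

def is_holonomic(theta, v0, length, g=9.81):
    """The fixed-length constraint holds only while the string is taut (T > 0)."""
    return string_tension(theta, v0, length, g=g) > 0.0

L, g = 1.0, 9.81
v0 = math.sqrt(4.0 * g * L)   # too slow to complete the loop with a taut string
for theta_deg in (0, 90, 120, 140):
    theta = math.radians(theta_deg)
    print(theta_deg, round(string_tension(theta, v0, L), 3), is_holonomic(theta, v0, L))
```

With these assumptions the tension is $T = m(v_0^2/L - 2g + 3g\cos\theta)$, so the string goes slack where $\cos\theta = (2 - v_0^2/gL)/3$, and a launch speed of at least $\sqrt{5gL}$ is needed for the constraint to remain holonomic over the full circle.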


Time dependence 


A constraint is called scleronomic if the constraint is not explicitly time dependent. This ignores the time 
dependence contained within the solution of the equations of motion. Fortunately a major fraction of 
systems are scleronomic. The constraint is called rheonomic if the constraint is explicitly time dependent. 
An example of a rheonomic system is where the size or shape of the surface of constraint is explicitly time 
dependent such as a deflating pneumatic tire. 


Energy conservation 


The solution depends on whether the constraint is conservative or dissipative, that is, if friction or drag are 
acting. The system will be conservative if there are no drag forces, and the constraint forces are perpendicular 
to the trajectory of the path such as the motion of a charged particle in a magnetic field. Forces of constraint 
can result from sliding of two solid surfaces, rolling of solid objects, fluid flow in a liquid or gas, or result from 
electromagnetic forces. Energy dissipation can result from friction, drag in a fluid or gas, or finite resistance 
of electric conductors leading to dissipation of induced electric currents in a conductor, e.g. eddy currents. 

A rolling constraint is unusual in that friction between the rolling bodies is necessary to maintain rolling.
A disk on a frictionless inclined plane will conserve its angular momentum since there is no torque acting
if the rolling contact is frictionless, that is, the disk will just slide. If the friction is sufficient to stop sliding,
then the bodies will roll and not slide. A perfect rolling body does not dissipate energy since no work is 
done at the instantaneous point of contact where both bodies are in zero relative motion and the force is 
perpendicular to the motion. In real life, a rolling wheel can involve a very small energy dissipation due to 
deformation at the point of contact coupled with non-elastic properties of the material used to make the 
wheel and the plane surface. For example, a pneumatic tire can heat up and expand due to flexing of the 
tire. 


5.7.6 Treatment of constraint forces in variational calculus 


There are three major approaches to handling constraint forces in variational calculus. All three of them exploit
the tremendous freedom and flexibility available when using generalized coordinates. (1) The generalized
coordinate approach, described in chapter 5.8, exploits the correlation of the $n$ coordinates due to the $m$
constraint forces to reduce the dimension of the equations of motion to $s = n - m$ degrees of freedom. This
approach embeds the $m$ constraint forces into the choice of generalized coordinates and does not determine
the constraint forces. (2) The Lagrange multiplier approach, described in chapter 5.9, exploits generalized
coordinates but includes the $m$ constraint forces in the Euler equations, so that the constraint
forces are determined in addition to the $n$ equations of motion. (3) The generalized forces approach, described in chapter
6.7.3, introduces constraint and other forces explicitly.


5.8 Generalized coordinates in variational calculus


Newtonian mechanics is based on a vectorial treatment of mechanics which can be difficult to apply when
solving complicated problems in mechanics. Constraint forces acting on a system usually are unknown. In
Newtonian mechanics constraint forces must be included explicitly so that they can be determined
simultaneously with the solution of the dynamical equations of motion. The major advantage of the variational
approaches is that solution of the dynamical equations of motion can be simplified by expressing the motion
in terms of $n$ independent generalized coordinates. These generalized coordinates can be any set of
independent variables, $q_i$, where $1 \le i \le n$, plus the corresponding velocities $\dot{q}_i$ for Lagrangian mechanics,
or the corresponding canonical variables, $q_i, p_i$ for Hamiltonian mechanics. These generalized coordinates for
the $n$ variables are used to specify the scalar functional dependence on these generalized coordinates. The
variational approach employs this scalar functional to determine the trajectory.

The generalized coordinates used for the variational approach do not need to be orthogonal; they only need to be
independent, since they are used only to completely specify the magnitude of the scalar functional. This greatly
expands the arsenal of possible generalized coordinates beyond what is available using Newtonian mechanics. For example,
generalized coordinates can be the dimensionless amplitudes for the $n$ normal modes of coupled oscillator
systems, or action-angle variables. In addition, generalized coordinates having different dimensions can be
used for each of the $n$ variables. Each generalized coordinate, $q_i$, specifies an independent mode of the system,
not a specific particle. For example, each normal mode of coupled oscillators can involve correlated motion of
several coupled particles.

The major advantage of using generalized coordinates is that they can be chosen
to be perpendicular to a corresponding constraint force, and therefore that specific constraint force does no
work for motion along that generalized coordinate. Moreover, the constrained motion does no work in the
direction of the constraint force for rigid constraints. Thus generalized coordinates allow specific constraint
forces to be ignored in evaluation of the minimized functional. This freedom and flexibility of choice of
generalized coordinates allows the correlated motion produced by the constraint forces to be embedded directly
into the choice of the independent generalized coordinates, and the actual constraint forces can be ignored.
Embedding the constraint-induced correlations into the generalized coordinates effectively “sweeps the
constraint forces under the rug”, which greatly simplifies the equations of motion for any system that
involves constraint forces. Selection of the appropriate generalized coordinates can be obvious, and often it is
performed subconsciously by the user.

Three variational approaches are used that employ generalized coordinates to derive the equations of
motion of a system that has $n$ generalized coordinates subject to $m$ constraints.

1) Minimal set of generalized coordinates: When the $m$ equations of constraint are holonomic, then
the $m$ algebraic constraint relations can be used to transform the coordinates into $s = n - m$ independent
generalized coordinates $q_i$. This approach reduces the number of unknowns, $n$, by the number of constraints,
$m$, to give a minimal set of $s = n - m$ independent generalized dynamical variables. The forces of constraint
are not explicitly discussed, or determined, when this generalized coordinate approach is employed. This
approach greatly simplifies solution of dynamical problems by avoiding the need for explicit treatment of the
constraint forces. This approach is straightforward for holonomic constraints, since the $n$ spatial coordinates
$y_1(x), \dots, y_n(x)$ are coupled by $m$ algebraic equations which can be used to make the transformation to
generalized coordinates. Thus the $n$ coupled spatial coordinates are transformed to $s = n - m$ independent
generalized dynamical coordinates $q_1(x), \dots, q_s(x)$, and their generalized first derivatives $q_1'(x), \dots, q_s'(x)$. These
generalized coordinates are independent, and thus it is possible to use Euler’s equation for each independent
parameter $q_i$


$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.34}$$


where $i = 1, 2, 3, \dots, s$. There are $s = n - m$ such Euler equations. The freedom to choose generalized coordinates
underlies the tremendous advantage of applying the variational approach.

2) Lagrange multipliers: The $n$ Lagrange equations, plus the $m$ equations of constraint, can be used
to explicitly determine the $n$ generalized coordinates plus the $m$ constraint forces. That is, $n + m$ unknowns
are determined. This approach is discussed in chapter 5.9.

3) Generalized forces: This approach introduces the constraint forces explicitly. This approach, applied
to Lagrangian mechanics, is discussed in chapter 6.6.3.

The above three approaches exploit generalized coordinates to handle constraint forces as described in
chapter 6.
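
A short worked illustration of the minimal-set approach, offered as a sketch rather than as the author's own example, is a point constrained to the surface of a sphere of radius $R$. There are $n = 3$ Cartesian coordinates and $m = 1$ holonomic constraint, leaving $s = n - m = 2$ independent generalized coordinates, for which the spherical angles $(\theta, \phi)$ are the natural choice:

```latex
% n = 3 coupled coordinates, m = 1 holonomic constraint
g = x^2 + y^2 + z^2 - R^2 = 0

% s = n - m = 2 independent generalized coordinates (\theta, \phi)
x = R\sin\theta\cos\phi, \qquad
y = R\sin\theta\sin\phi, \qquad
z = R\cos\theta

% The constraint is satisfied identically for every (\theta, \phi):
x^2 + y^2 + z^2 = R^2\sin^2\theta\,(\cos^2\phi + \sin^2\phi) + R^2\cos^2\theta = R^2

% so it never has to be imposed again, e.g. the element of path length becomes
ds^2 = dx^2 + dy^2 + dz^2 = R^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)
```

Because the constraint is embedded in the choice of $(\theta, \phi)$, Euler’s equation can be applied to these two coordinates independently and the force holding the point on the sphere never appears.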


5.9 Lagrange multipliers for holonomic constraints


5.9.1 Algebraic equations of constraint 


The Lagrange multiplier technique provides a powerful, and elegant, way to handle holonomic constraints
using Euler’s equations¹. The general method of Lagrange multipliers for $n$ variables, with $m$ constraints,
is best introduced using Bernoulli’s ingenious exploitation of virtual infinitesimal displacements, which
Lagrange signified by the symbol $\delta$. The term “virtual” refers to an intentional variation of the generalized
coordinates $\delta q_i$ in order to elucidate the local sensitivity of a function $F(q_i, x)$ to variation of the variable.
Contrary to the usual infinitesimal interval in differential calculus, where an actual displacement $dq_i$ occurs
during a time $dt$, a virtual displacement is imagined to be an instantaneous, infinitesimal displacement of
a coordinate, not an actual displacement, in order to elucidate the local dependence of $F$ on the coordinate.
The local dependence of any functional $F$ on virtual displacements of all $n$ coordinates is given by taking
the partial differentials of $F$:


$$\delta F = \sum_{i}^{n} \frac{\partial F}{\partial q_i}\, \delta q_i \tag{5.35}$$


The function $F$ is stationary, that is, an extremum, if equation 5.35 equals zero. The extremum of the
functional $F$, given by equation 5.16, can be expressed in a compact form using the virtual displacement
formalism as


$$\delta F = \delta \int_{x_1}^{x_2} \sum_{i}^{n} F\left[q_i(x), q_i'(x); x\right] dx = \sum_{i}^{n} \frac{\partial F}{\partial q_i}\, \delta q_i = 0 \tag{5.36}$$


The auxiliary conditions, due to the $m$ holonomic algebraic constraints for the $n$ variables $q_i$, can be
expressed by the $m$ equations

$$g_k(q_i; x) = 0 \tag{5.37}$$


where $1 \le k \le m$ and $1 \le i \le n$ with $m < n$. The variational problem for the $m$ holonomic constraint
equations also can be written in terms of $m$ differential equations, where $1 \le k \le m$,


$$\delta g_k = \sum_{i=1}^{n} \frac{\partial g_k}{\partial q_i}\, \delta q_i = 0 \tag{5.38}$$


Since equations 5.36 and 5.38 both equal zero, the $m$ equations 5.38 can be multiplied by arbitrary
undetermined factors $\lambda_k$ and added to equation 5.36 to give


$$\delta F(q_i, x) + \lambda_1 \delta g_1 + \lambda_2 \delta g_2 + \cdots + \lambda_k \delta g_k + \cdots + \lambda_m \delta g_m = 0 \tag{5.39}$$


Note that this is not trivial in that, although the sum of the constraint equations for each $q_i$ is zero, the
individual terms of the sum are not zero.
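Inserting equations 5.36 and 5.38 into 5.39, and collecting all $n$ terms, gives


$$\sum_{i}^{n} \left[ \frac{\partial F}{\partial q_i} + \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} \right] \delta q_i = 0 \tag{5.40}$$

Note that all the $\delta q_i$ are free independent variations and thus the terms in the brackets, which are the
coefficients of each $\delta q_i$, individually must equal zero. For each of the $n$ values of $i$, the corresponding bracket
implies


$$\frac{\partial F}{\partial q_i} + \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} = 0 \tag{5.41}$$

This is equivalent to what would be obtained from the variational principle

$$\delta F + \sum_{k=1}^{m} \lambda_k\, \delta g_k = 0 \tag{5.42}$$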


¹ This textbook uses the symbol $q_i$ to designate a generalized coordinate, and $q_i'$ to designate the corresponding first derivative
with respect to the independent variable, in order to differentiate the spatial coordinates from the more powerful generalized
coordinates.

Equation 5.42 is equivalent to a variational problem for finding the stationary value of $F'$

$$\delta(F') = \delta\left(F + \sum_{k}^{m} \lambda_k g_k\right) = 0 \tag{5.43}$$


where $F'$ is defined to be


$$F' \equiv \left(F + \sum_{k=1}^{m} \lambda_k g_k\right) \tag{5.44}$$

The solution to equation 5.43 can be found using Euler’s differential equation 5.19 of variational calculus.
The condition $\delta(F') = 0$ at the extremum corresponds to following contours of constant $F'$, which lie in the surface that is
perpendicular to the gradients of the terms in $F'$. The Lagrange multiplier constants are required because,
although these gradients are parallel at the extremum, the magnitudes of the gradients are not equal.
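
The statement about parallel gradients can be checked on a deliberately simple finite-dimensional case. The sketch below is an illustration chosen for this purpose, not an example from the text: it extremizes the assumed function $F(x, y) = x + y$ subject to the assumed constraint $g = x^2 + y^2 - 1 = 0$, and verifies the condition of equation 5.41, $\partial F/\partial q_i + \lambda\, \partial g/\partial q_i = 0$.

```python
import math

def F(x, y):
    return x + y                 # function to be made stationary (assumed example)

def g(x, y):
    return x * x + y * y - 1.0   # holonomic constraint: unit circle

def grad_F(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    return (2.0 * x, 2.0 * y)

# Analytic solution of equation 5.41 together with g = 0:
#   1 + 2*lam*x = 0,  1 + 2*lam*y = 0,  x^2 + y^2 = 1
lam = -1.0 / math.sqrt(2.0)
x0 = y0 = 1.0 / math.sqrt(2.0)

# Residuals of equation 5.41; both vanish, so grad F is parallel to grad g
residuals = [dF + lam * dg for dF, dg in zip(grad_F(x0, y0), grad_g(x0, y0))]

# Independent brute-force check: scan the constraint curve for the largest F
n_samples = 100000
best_value = max(
    F(math.cos(2.0 * math.pi * k / n_samples), math.sin(2.0 * math.pi * k / n_samples))
    for k in range(n_samples)
)

print(residuals, g(x0, y0), best_value, F(x0, y0))
```

The two gradients are parallel at the stationary point but differ in magnitude by the factor $|\lambda| = 1/\sqrt{2}$, which is exactly the role the multiplier plays. The second root, $\lambda = +1/\sqrt{2}$, gives the minimum at $x = y = -1/\sqrt{2}$.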

The beauty of the Lagrange multipliers approach is that the auxiliary conditions do not have to be
handled explicitly, since they are handled automatically as $m$ additional free variables during solution of
Euler’s equations for a variational problem with $n + m$ unknowns fit to $n + m$ equations. That is, the $n$
variables $q_i$ are determined by the variational procedure using the $n$ variational equations


$$\frac{d}{dx}\left(\frac{\partial F'}{\partial q_i'}\right) - \frac{\partial F'}{\partial q_i} = 0 \tag{5.45}$$


simultaneously with the $m$ variables $\lambda_k$, which are determined by the $m$ variational equations


$$\frac{d}{dx}\left(\frac{\partial F'}{\partial \lambda_k'}\right) - \frac{\partial F'}{\partial \lambda_k} = 0 \tag{5.46}$$


Equation 5.45 usually is expressed as


$$\frac{\partial F}{\partial q_i} - \frac{d}{dx}\left(\frac{\partial F}{\partial q_i'}\right) + \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} = 0 \tag{5.47}$$


The elegance of Lagrange multipliers is that a single variational approach allows simultaneous determination
of all $n + m$ unknowns. Chapter 6.2 shows that the forces of constraint are given directly by the $\lambda_k \frac{\partial g_k}{\partial q_i}$ terms.


5.7 Example: Two dependent variables coupled by one holonomic constraint 


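The powerful, and generally applicable, Lagrange multiplier technique is illustrated by considering the case
of only two dependent variables, $y(x)$ and $z(x)$, with the function $f(y(x), y'(x), z(x), z'(x); x)$ and with one
holonomic equation of constraint coupling these two dependent variables. The extremum is given by requiring


$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \frac{\partial y}{\partial \epsilon} + \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \frac{\partial z}{\partial \epsilon} \right] dx = 0 \tag{A}$$

with the constraint expressed by the auxiliary condition

$$g(y, z; x) = 0 \tag{B}$$

Note that the variations $\frac{\partial y}{\partial \epsilon}$ and $\frac{\partial z}{\partial \epsilon}$ are no longer independent because of the constraint equation, thus
the two terms in the brackets of equation A are not separately equal to zero at the extremum. However,
differentiating the constraint equation B gives

$$\frac{dg}{d\epsilon} = \left( \frac{\partial g}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial \epsilon} \right) = 0 \tag{C}$$

No $\frac{\partial g}{\partial x}$ term applies because, for the independent variable, $\frac{\partial x}{\partial \epsilon} = 0$. Introduce the neighboring paths by adding
the auxiliary functions


$$y(\epsilon, x) = y(x) + \epsilon\, \eta_1(x) \tag{D}$$

$$z(\epsilon, x) = z(x) + \epsilon\, \eta_2(x) \tag{E}$$


Inserting the differentials of equations D and E into C gives


$$\frac{dg}{d\epsilon} = \left( \frac{\partial g}{\partial y}\, \eta_1(x) + \frac{\partial g}{\partial z}\, \eta_2(x) \right) = 0 \tag{F}$$

implying that

$$\eta_2(x) = -\frac{\partial g / \partial y}{\partial g / \partial z}\, \eta_1(x)$$


Equation A can be rewritten as


$$\int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \eta_1(x) + \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \eta_2(x) \right] dx = 0$$

$$\int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) - \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \frac{\partial g / \partial y}{\partial g / \partial z} \right] \eta_1(x)\, dx = 0 \tag{G}$$


Equation G now contains only a single arbitrary function $\eta_1(x)$ that is not restricted by the constraint. Thus
the bracket in the integrand of equation G must equal zero for the extremum. That is


$$\left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \left( \frac{\partial g}{\partial y} \right)^{-1} = \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \left( \frac{\partial g}{\partial z} \right)^{-1}$$

Now the left-hand side of this equation is only a function of $f$ and $g$ with respect to $y$ and $y'$, while the
right-hand side is a function of $f$ and $g$ with respect to $z$ and $z'$. Because both sides are functions of $x$,
each side can be set equal to a function $-\lambda(x)$. Thus the above equations can be written as

$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = \lambda(x) \frac{\partial g}{\partial y} \qquad\qquad \frac{d}{dx}\frac{\partial f}{\partial z'} - \frac{\partial f}{\partial z} = \lambda(x) \frac{\partial g}{\partial z} \tag{H}$$


The complete solution of the three unknown functions $y(x)$, $z(x)$, and $\lambda(x)$ is obtained by solving the two
equations H plus the equation of constraint B. The Lagrange multiplier $\lambda(x)$ is related to the force of
constraint. This example of two variables coupled by one holonomic constraint conforms with the general
relation for many variables and constraints given by equation 5.47.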


5.9.2 Integral equations of constraint 


The constraint equation also can be given in an integral form, which is used frequently for isoperimetric
problems. Consider a one dependent-variable isoperimetric problem of finding the curve $q = q(x)$ such that
the functional has an extremum, and the curve $q(x)$ satisfies boundary conditions such that $q(x_1) = a$ and
$q(x_2) = b$. That is


$$F(q) = \int_{x_1}^{x_2} f(q, q'; x)\, dx \tag{5.48}$$


is an extremum such that the fixed length $l$ of the perimeter satisfies the integral constraint


$$G(q) = \int_{x_1}^{x_2} g(q, q'; x)\, dx = l \tag{5.49}$$


Analogous to equation 5.44, these two functionals can be combined, requiring that

$$\delta K(q, x, \lambda) \equiv \delta\left[F(q) + \lambda G(q)\right] = \delta \int_{x_1}^{x_2} \left[f + \lambda g\right] dx = 0 \tag{5.50}$$


That is, it is an extremum for both $q(x)$ and the Lagrange multiplier $\lambda$. This effectively involves finding the
extremum path for the function $K(q, x, \lambda) = F(q, x) + \lambda G(q, x)$, where both $q(x)$ and $\lambda$ are the minimized
variables. Therefore the curve $q(x)$ must satisfy the differential equation


$$\frac{d}{dx}\frac{\partial f}{\partial q'} - \frac{\partial f}{\partial q} + \lambda \left[ \frac{d}{dx}\frac{\partial g}{\partial q'} - \frac{\partial g}{\partial q} \right] = 0 \tag{5.51}$$

subject to the boundary conditions $q(x_1) = a$, $q(x_2) = b$, and $G(q) = l$.


5.8 Example: Catenary 


One isoperimetric problem is the catenary, which is the shape that a uniform rope or chain of fixed length $l$
takes to minimize the gravitational potential energy. Let the rope have a uniform mass per unit length of $\sigma$
kg/m.

[Figure: The catenary]

The gravitational potential energy is


$$U = \sigma g \int_1^2 y\, ds = \sigma g \int_1^2 y \sqrt{dx^2 + dy^2} = \sigma g \int_1^2 y \sqrt{1 + y'^2}\, dx$$


The constraint is that the length be a constant $l$


$$l = \int_1^2 ds = \int_1^2 \sqrt{1 + y'^2}\, dx$$


Thus the function is $f(y, y'; x) = y\sqrt{1 + y'^2}$ while the integral constraint sets $g = \sqrt{1 + y'^2}$.
These need to be inserted into the Euler equation (5.51) by defining


$$F = f + \lambda g = (y + \lambda)\sqrt{1 + y'^2}$$

Note that this case is one where $\frac{\partial F}{\partial x} = 0$ and $\lambda$ is a constant; also
defining $z = y + \lambda$ gives $z' = y'$. Therefore Euler’s equation can be written in the integral form


$$F - z' \frac{\partial F}{\partial z'} = c = \text{constant}$$


Inserting the relation $F = z\sqrt{1 + z'^2}$ gives


$$z\sqrt{1 + z'^2} - \frac{z\, z'^2}{\sqrt{1 + z'^2}} = c$$


where $c$ is an arbitrary constant. This simplifies to

$$z'^2 = \left(\frac{z}{c}\right)^2 - 1$$

The integral of this is


$$z = c \cosh\left(\frac{x + b}{c}\right)$$


where $b$ and $c$ are arbitrary constants that, together with $\lambda$, are fixed by the locations of the two fixed ends of the rope and by its length $l$.
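
The catenary solution can be verified numerically against the first integral of Euler’s equation. The following minimal sketch, with function names and sample constants chosen here purely for illustration, evaluates $F - z'\,\partial F/\partial z'$ along $z = c\cosh((x + b)/c)$ and confirms that it equals $c$ at every $x$. It also evaluates the rope length from the closed form of $\int \sqrt{1 + z'^2}\, dx$.

```python
import math

def catenary(x, b, c):
    """Shifted height z = y + lambda of the hanging rope."""
    return c * math.cosh((x + b) / c)

def catenary_slope(x, b, c):
    """Derivative z' = dz/dx of the catenary."""
    return math.sinh((x + b) / c)

def first_integral(x, b, c):
    """Left-hand side z*sqrt(1+z'^2) - z*z'^2/sqrt(1+z'^2) of the integral form."""
    z = catenary(x, b, c)
    zp = catenary_slope(x, b, c)
    root = math.sqrt(1.0 + zp**2)
    return z * root - z * zp**2 / root

def rope_length(x1, x2, b, c):
    """Length between x1 and x2: integral of cosh((x+b)/c) dx in closed form."""
    return c * (math.sinh((x2 + b) / c) - math.sinh((x1 + b) / c))

b, c = 0.3, 2.0   # illustrative constants, not taken from the text
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(x, first_integral(x, b, c))
print("length:", rope_length(-1.0, 1.0, b, c))
```

The first integral is independent of $x$ because $\sqrt{1 + \sinh^2 u} = \cosh u$, which reduces the expression to $z / \cosh((x + b)/c) = c$.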


5.9 Example: The Queen Dido problem 


A famous constrained isoperimetric legend is that of Dido, first Queen of Carthage. Legend says that,
when Dido landed in North Africa, she persuaded the local chief to sell her as much land as an oxhide could
contain. She cut an oxhide into narrow strips and joined them to make a continuous thread more than four
kilometers in length, which was sufficient to enclose the land adjoining the coast on which Carthage was built.
Her problem was to enclose the maximum area for a given perimeter. Let us assume that the coast line is
straight and the ends of the thread are at $\pm a$ on the coast line. The enclosed area is given by


$$A = \int_{-a}^{+a} y\, dx$$


The constraint equation is that the total perimeter equals $l$.


$$\int_{-a}^{+a} \sqrt{1 + y'^2}\, dx = l$$

Thus we have the functional $f(y, y'; x) = y$ and $g(y, y'; x) = \sqrt{1 + y'^2}$. Then $\frac{\partial f}{\partial y} = 1$, $\frac{\partial f}{\partial y'} = 0$, and $\frac{\partial g}{\partial y} = 0$.
Inserting these into the Euler-Lagrange equation (5.51) gives


$$-1 + \lambda \frac{d}{dx}\left( \frac{y'}{\sqrt{1 + y'^2}} \right) = 0$$

That is

$$\frac{d}{dx}\left( \frac{y'}{\sqrt{1 + y'^2}} \right) = \frac{1}{\lambda}$$

Integrating with respect to $x$ gives

$$\frac{\lambda y'}{\sqrt{1 + y'^2}} = x - b$$


where $b$ is a constant of integration. This can be rearranged to give


$$y' = \frac{\pm (x - b)}{\sqrt{\lambda^2 - (x - b)^2}}$$


The integral of this is


$$y = \mp\sqrt{\lambda^2 - (x - b)^2} + c$$

Rearranging this gives

$$(x - b)^2 + (y - c)^2 = \lambda^2$$


This is the equation of a circle centered at $(b, c)$. Setting the bounds to be $(-a, 0)$ to $(a, 0)$ gives that
$b = c = 0$ and the circle radius is $\lambda$. Thus the length of the thread must be $l = \pi\lambda$. Assuming that $l = 4$ km,
then $\lambda = 1.27$ km and Queen Dido could buy an area of $\pi\lambda^2/2 = 2.55$ km².
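
The arithmetic of Dido’s solution is easy to reproduce. The sketch below uses hypothetical function names; the three-sided rectangle is introduced here only as a comparison shape and is not in the original text. It evaluates the semicircle radius and area for a thread of length $l$, and compares the result with the best rectangle that the same thread could enclose against a straight coast.

```python
import math

def dido_semicircle(l):
    """Radius and enclosed area of the semicircular solution with thread length l."""
    lam = l / math.pi                # from l = pi * lambda
    area = 0.5 * math.pi * lam**2    # half of a full circle, equal to l**2 / (2*pi)
    return lam, area

def best_rectangle_area(l):
    """Largest three-sided rectangle against the coast: width w and depth h with
    w + 2h = l. The area w*h is maximal at h = l/4, giving l**2 / 8."""
    return l**2 / 8.0

l = 4.0  # km, the thread length assumed in the example
lam, area = dido_semicircle(l)
print(f"radius = {lam:.3f} km, semicircle area = {area:.3f} km^2")
print(f"best rectangle area = {best_rectangle_area(l):.3f} km^2")
```

For any thread length the semicircle encloses $l^2/2\pi$, about 27% more than the $l^2/8$ of the best rectangle, consistent with the circle being the extremal curve.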


5.10 Geodesic 


The geodesic is defined as the shortest path between two fixed points for motion that is constrained to lie 
on a surface. Variational calculus provides a powerful approach for determining the equations of motion 
constrained to follow a geodesic. 

The use of variational calculus is illustrated by considering the geodesic constrained to follow the surface
of a sphere of radius $R$. As discussed in appendix C.2.3, the element of path length on the surface of the
sphere is given in spherical coordinates as $ds = R\sqrt{d\theta^2 + (\sin\theta\, d\phi)^2}$. Therefore the distance $s$ between two
points 1 and 2 is

$$s = R \int_1^2 \left[ \sqrt{\left(\frac{d\theta}{d\phi}\right)^2 + \sin^2\theta} \right] d\phi \tag{5.52}$$


The function $f$ for ensuring that $s$ be an extremum value uses

$$f = \sqrt{\theta'^2 + \sin^2\theta} \tag{5.53}$$


where $\theta' = \frac{d\theta}{d\phi}$. This is a case where $\frac{\partial f}{\partial \phi} = 0$, and thus the integral form of Euler’s equation can be used,
leading to the result that


$$\sqrt{\theta'^2 + \sin^2\theta} - \theta' \frac{\partial}{\partial \theta'}\sqrt{\theta'^2 + \sin^2\theta} = \text{constant} = a \tag{5.54}$$


This gives that

$$\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta} \tag{5.55}$$

This can be rewritten as

$$\frac{d\phi}{d\theta} = \frac{1}{\theta'} = \frac{a \csc^2\theta}{\sqrt{1 - a^2 \csc^2\theta}} \tag{5.56}$$


Solving for $\phi$ gives


$$\phi = \sin^{-1}\left( \frac{\cot\theta}{\beta} \right) + \alpha \tag{5.57}$$

where

$$\beta^2 \equiv \frac{1 - a^2}{a^2} \tag{5.58}$$

That is

$$\cot\theta = \beta \sin(\phi - \alpha) \tag{5.59}$$

Expanding the sine and cotangent gives

$$(\beta \cos\alpha)\, R\sin\theta \sin\phi - (\beta \sin\alpha)\, R\sin\theta \cos\phi = R\cos\theta \tag{5.60}$$


Since the brackets are constants, this can be written as

$$A(R\sin\theta \sin\phi) - B(R\sin\theta \cos\phi) = (R\cos\theta) \tag{5.61}$$

The terms in the brackets are just expressions for the rectangular coordinates $x, y, z$. That is,

$$Ay - Bx = z \tag{5.62}$$


This is the equation of a plane passing through the center of the sphere. Thus the geodesic on a sphere
is the path where a plane through the center intersects the sphere as well as the initial and final locations.
This geodesic is called a great circle. Euler’s equation gives both the maximum and minimum extremum
path lengths for motion on this great circle.
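
The step from equation 5.59 to the plane of equation 5.62 can be confirmed numerically. This minimal sketch, with function names and sample values of $\alpha$ and $\beta$ that are illustrative assumptions, generates points satisfying $\cot\theta = \beta\sin(\phi - \alpha)$ and checks that each lies both on the sphere and in the plane $Ay - Bx = z$ with $A = \beta\cos\alpha$ and $B = \beta\sin\alpha$.

```python
import math

def geodesic_point(phi, alpha, beta, R=1.0):
    """Cartesian point on the sphere satisfying cot(theta) = beta*sin(phi - alpha)."""
    cot_theta = beta * math.sin(phi - alpha)
    # atan2(1, c) returns the angle in (0, pi) whose cotangent is c
    theta = math.atan2(1.0, cot_theta)
    x = R * math.sin(theta) * math.cos(phi)
    y = R * math.sin(theta) * math.sin(phi)
    z = R * math.cos(theta)
    return x, y, z

def plane_residual(point, alpha, beta):
    """Residual A*y - B*x - z of equation 5.62; zero for points in the plane."""
    x, y, z = point
    A = beta * math.cos(alpha)
    B = beta * math.sin(alpha)
    return A * y - B * x - z

alpha, beta, R = 0.4, 1.7, 2.0   # illustrative constants
for k in range(8):
    phi = 2.0 * math.pi * k / 8
    p = geodesic_point(phi, alpha, beta, R)
    radius = math.sqrt(sum(c * c for c in p))
    print(f"phi={phi:.3f}  residual={plane_residual(p, alpha, beta):+.1e}  radius={radius:.3f}")
```

The plane has no constant term, so it passes through the origin, which is the center of the sphere. Its intersection with the sphere is therefore a great circle.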

Chapter 17 discusses the geodesic in the four-dimensional space-time coordinates that underlie the General 
Theory of Relativity. As a consequence, the use of the calculus of variations to determine the equations of 
motion for geodesics plays a pivotal role in the General Theory of Relativity. 


5.11 Variational approach to classical mechanics 


This chapter has introduced the general principles of variational calculus needed for understanding the
Lagrangian and Hamiltonian approaches to classical mechanics. Although variational calculus was developed
originally for classical mechanics, it has now grown to be an important branch of mathematics with
applications to many other fields outside of physics. The prologue of this book emphasized the dramatic differences
between the differential vectorial approach of Newtonian mechanics, and the integral variational approaches
of Lagrangian and Hamiltonian mechanics. The Newtonian vectorial approach involves solving Newton’s
differential equations of motion that relate the force and momenta vectors. This requires knowledge of the
time dependence of all the force vectors, including constraint forces, acting on the system, which can be very
complicated. Chapter 2 showed that the first-order time integrals, equations 2.10, 2.16, relate the initial and
final total momenta without requiring knowledge of the complicated instantaneous forces acting during the
collision of two bodies. Similarly, for conservative systems, the first-order spatial integral, equation 2.21,
relates the initial and final total energies to the net work done on the system without requiring knowledge
of the instantaneous force vectors. The first-order spatial integral has the advantage that it is a scalar
quantity, in contrast to time integrals which are vector quantities. These first-order integral relations are used
frequently in Newtonian mechanics to derive solutions of the equations of motion that avoid having to solve
complicated differential equations of motion.

This chapter has illustrated that variational principles provide a means of deriving more detailed
information, such as the trajectories for the motion between given initial and final conditions, by requiring that
scalar functionals have extrema values. For example, the solution of the brachistochrone problem determined
the trajectory having the minimum transit time, based on only the magnitudes of the kinetic and
gravitational potential energies. Similarly, the catenary shape of a suspended chain was derived by minimizing the
gravitational potential energy. The calculus of variations uses Euler’s equations to determine directly the
differential equations of motion of the system that lead to the functional of interest being stationary at an
extremum. The Lagrangian and Hamiltonian variational approaches to classical mechanics are discussed
in chapters 6 to 16. The broad range of applicability, the flexibility, and the power provided by variational
approaches to classical mechanics and modern physics will be illustrated.


5.12 Summary


Euler’s differential equation: The calculus of variations has been introduced and Euler’s differential
equation was derived. The calculus of variations reduces to varying the functions $y_i(x)$, where $i = 1, 2, 3, \dots, n$,
such that the integral


$$F = \int_{x_1}^{x_2} f\left[y_i(x), y_i'(x); x\right] dx \tag{5.16}$$

is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, and $y_i(x)$ are
the dependent variables with first derivatives $y_i' \equiv \frac{dy_i}{dx}$. The quantity $f[y_i(x), y_i'(x); x]$ has some given
dependence on $y_i$, $y_i'$ and $x$. The calculus of variations involves varying the functions $y_i(x)$ until a stationary
value of $F$ is found, which is presumed to be an extremum. It was shown that if the $y_i(x)$ are independent,
then the extremum value of $F$ leads to $n$ independent Euler equations


$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$


where $i = 1, 2, 3, \dots, n$. This can be used to determine the functional form $y_i(x)$ that ensures that the integral
$F = \int f[y_i(x), y_i'(x); x]\, dx$ is a stationary value, that is, presumably a maximum or minimum value.

Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$, and the total
derivative for the independent variable $x$.

Euler’s integral equation: It was shown that if the function $f[y_i(x), y_i'(x); x]$ does not depend explicitly on
the independent variable, then Euler’s differential equation can be written in an integral form. That is,
when $\frac{\partial f}{\partial x} = 0$, the first integral of the Euler equation is a constant


$$f - y' \frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$

Constrained variational systems: Most applications involve constraints on the motion. The equations 
of constraint can be classified according to whether the constraints are holonomic or non-holonomic, the time 
dependence of the constraints, and whether the constraint forces are conservative. 

Generalized coordinates in variational calculus: Independent generalized coordinates can be chosen 
that are perpendicular to the rigid constraint forces and therefore the constraint does not contribute to the 
functional being minimized. That is, the constraints are embedded into the generalized coordinates and thus 
the constraints can be ignored when deriving the variational solution. 

Minimal set of generalized coordinates: If the constraints are holonomic then the $m$ holonomic
equations of constraint can be used to transform the $n$ coupled generalized coordinates to $s = n - m$
independent generalized variables $q_i, q_i'$. The generalized coordinate method then uses Euler’s equations to
determine these $s = n - m$ independent generalized coordinates.


$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.34}$$


Lagrange multipliers for holonomic constraints: The Lagrange multipliers approach for $n$ variables,
plus $m$ holonomic equations of constraint, determines all $n + m$ unknowns for the system. The holonomic
forces of constraint acting on the $n$ variables are related to the Lagrange multiplier terms $\lambda_k \frac{\partial g_k}{\partial y_i}$ that
are introduced into the Euler equations. That is,


$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} + \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial y_i} = 0 \tag{5.47}$$


where the holonomic equations of constraint are given by

$$g_k(y_i; x) = 0 \tag{5.37}$$


The advantage of using the Lagrange multiplier approach is that the variational procedure simultaneously
determines both the equations of motion for the $n$ variables plus the $m$ constraint forces acting on the
system.


Workshop exercises


1. Find the extremal of the functional 


that satisfies x(1) = 3 and x(2) = 18. Show that this extremal provides the global minimum of J. 


2. Consider the use of equations of constraint. 


(a) A particle is constrained to move on the surface of a sphere. What are the equations of constraint for this 
system? 

(b) A disk of mass m and radius R rolls without slipping on the outside surface of a half-cylinder of radius 
5R. What are the equations of constraint for this system? 

(c) What are holonomic constraints? Which of the equations of constraint that you found above are holo- 
nomic? 


(d) Equations of constraint that do not explicitly contain time are said to be scleronomic. Moving constraints 
are rheonomic. Are the equations of constraint that you found above scleronomic or rheonomic? 


3. For each of the following systems, describe the generalized coordinates that would work best. There may be 
more than one answer for each system. 


(a) An inclined plane of mass M is sliding on a smooth horizontal surface, while a particle of mass m is 
sliding on the smooth inclined surface. 


(b) A disk rolls without slipping across a horizontal plane. The plane of the disk remains vertical, but it is 
free to rotate about a vertical axis. 


(c) A double pendulum consisting of two simple pendula, with one pendulum suspended from the bob of the 
other. The two pendula have equal lengths and have bobs of equal mass. Both pendula are confined to 
move in the same plane. 


(d) A particle of mass m is constrained to move on a circle of radius R. The circle rotates in space about 
one point on the circle, which is fixed. The rotation takes place in the plane of the circle, with constant 
angular speed w, in the absence of a gravitational force. 


(e) A particle of mass m is attracted toward a given point by a force of magnitude k/r^2, where k is a constant.


4. Looking back at the systems in problem 3, which ones could have equations of constraint? How would you 
classify the equations of constraint (holonomic, scleronomic, rheonomic, etc.)? 
Problems


1. Find the extremal of the functional

    J(x) = \int_{0}^{\pi} \left( 2x \sin t - \dot{x}^2 \right) dt

that satisfies x(0) = x(π) = 0. Show that this extremal provides the global maximum of J.

2. Find and describe the path y = y(x) for which the integral \int_{x_1}^{x_2} \sqrt{x}\,\sqrt{1 + (y')^2}\; dx is stationary.
3. Find the dimensions of the parallelepiped of maximum volume circumscribed by a sphere of radius R. 
4. Consider a single loop of the cycloid having a fixed value of a as shown in the figure. A cart is released from
rest at a point P_0 anywhere on the track between O and the lowest point P, that is, P_0 has a parameter
0 < θ_0 < π.

(a) Show that the time T for the cart to slide from P_0 to P is given by the integral

    T(P_0 \to P) = \sqrt{\frac{a}{g}} \int_{\theta_0}^{\pi} \sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\; d\theta

(b) Prove that this time T is equal to π√(a/g), which is independent of the position P_0.


(c) Explain qualitatively how this surprising result can possibly be true. 


5. Consider a medium for which the refractive index n = a/r^2, where a is a constant and r is the distance from
the origin. Use Fermat's Principle to find the path of a ray of light travelling in a plane containing the origin.
Hint: use two-dimensional polar coordinates with φ = φ(r). Show that the resulting path is a circle through
the origin.


6. Find the shortest path between the (x, y, z) points (0, −1, 0) and (0, 1, 0) on the conical surface

    z = 1 - \sqrt{x^2 + y^2}

What is the length of this path? Note that this is the shortest mountain path around a volcano.


7. Show that the geodesic on the surface of a right circular cylinder is a segment of a helix. 


Chapter 6 


Lagrangian dynamics 


6.1 Introduction 


Newtonian mechanics is based on vector observables such as momentum and force, and Newton's equations 
of motion can be derived if the forces are known. Newtonian mechanics becomes difficult to apply for many- 
body systems that involve constraint forces. The alternative algebraic Lagrangian mechanics approach is 
based on the concept of scalar energies which circumvent many of the difficulties in handling constraint forces 
and many-body systems. 

The Lagrangian approach to classical dynamics is based on the calculus of variations introduced in chapter
5. It was shown that the calculus of variations determines the function y_i(x) such that the scalar functional

    F = \int_{x_1}^{x_2} \sum_{i}^{n} f[y_i(x), y_i'(x); x]\; dx    (6.1)

is an extremum, that is, a maximum or minimum. Here x is the independent variable, y_i(x) are the n
dependent variables, and y_i' ≡ dy_i/dx are their derivatives, where i = 1, 2, 3, ..., n. The function f[y_i(x), y_i'(x); x] has
an assumed dependence on y_i, y_i' and x. The calculus of variations determines the functional dependence
of the dependent variables y_i(x) on the independent variable x that is needed to ensure that F is an
extremum. For n independent variables, F has a stationary point, which is presumed to be an extremum,
that is determined by solution of Euler's differential equations

    \frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = 0    (6.2)


If the coordinates y_i(x) are independent, then the Euler equations, (6.2), for each coordinate i are
independent. However, for constrained motion, the constraints lead to auxiliary conditions that correlate the
coordinates. As shown in chapter 5, a transformation to independent generalized coordinates can be made
such that the correlations induced by the constraint forces are embedded into the choice of the independent
generalized coordinates. The use of generalized coordinates in Lagrangian mechanics simplifies derivation of
the equations of motion for constrained systems. For example, for a system of n coordinates, that involves
m holonomic constraints, there are s = n − m independent generalized coordinates. For such holonomic
constrained motion, it will be shown that the Euler equations can be solved using any of the following
three alternative ways.

1) The minimal set of generalized coordinates approach involves finding a set of s = n − m independent
generalized coordinates q_i that satisfy the assumptions underlying (6.2). These generalized coordinates
can be determined if the m equations of constraint are holonomic, that is, related by algebraic equations of
constraint

    g_k(q_i; t) = 0    (6.3)

where k = 1, 2, 3, ..., m. These equations uniquely determine the relationship between the n correlated
coordinates. This method has the advantage that it reduces the system of n coordinates, subject to m constraints,
to s = n − m independent generalized coordinates which reduces the dimension of the problem to be solved.
However, it does not explicitly determine the forces of constraint which are effectively swept under the rug.
2) The Lagrange multipliers approach takes account of the correlation between the n coordinates and
m holonomic constraints by introducing the Lagrange multipliers λ_k(t). These n generalized coordinates q_i
are correlated by the m holonomic constraints.

    \frac{d}{dt}\frac{\partial f}{\partial \dot{q}_i} - \frac{\partial f}{\partial q_i} = \sum_{k}^{m} \lambda_k(t) \frac{\partial g_k}{\partial q_i}    (6.4)

where i = 1, 2, 3, ..., n. The Lagrange multiplier approach has the advantage that Euler's calculus of variations
automatically uses the n Lagrange equations, plus the m equations of constraint, to explicitly determine both
the n coordinates q_i and the m forces of constraint which are related to the Lagrange multipliers λ_k as given
in equation (6.4). Chapter 6.2 shows that the Σ_k λ_k(t) ∂g_k/∂q_i terms are directly related to the holonomic
forces of constraint.

3) The generalized force approach incorporates the forces of constraint explicitly as will be shown in 
chapter 6.5.4. Incorporating the constraint forces explicitly allows use of holonomic, non-holonomic, and 
non-conservative constraint forces. 

Understanding the Lagrange formulation of classical mechanics is facilitated by use of a simple
non-rigorous plausibility approach that is based on Newton's laws of motion. This introductory plausibility
approach will be followed by two more rigorous derivations of the Lagrangian formulation developed using either
d'Alembert's Principle or Hamilton's Principle. These better elucidate the physics underlying the Lagrange
and Hamiltonian analytic representations of classical mechanics. In 1788 Lagrange derived his equations of
motion using the differential d'Alembert Principle, which extends to dynamical systems the Bernoulli Principle
of infinitesimal virtual displacements and virtual work. The other approach, developed in 1834, uses the
integral Hamilton's Principle to derive the Lagrange equations. Hamilton's Principle is discussed in more
detail in chapter 9. Euler's variational calculus underlies d'Alembert's Principle and Hamilton's Principle
since both are based on the philosophical belief that the laws of nature prefer economy of motion.
Chapters 6.2 to 6.5 show that both d'Alembert's Principle and Hamilton's Principle lead to the Euler-Lagrange
equations. This will be followed by a series of examples that illustrate the use of Lagrangian mechanics in
classical mechanics.


6.2 Newtonian plausibility argument for Lagrangian mechanics 


Insight into the physics underlying Lagrange mechanics is given by showing the direct relationship between 
Newtonian and Lagrangian mechanics. The variational approaches to classical mechanics exploit the first- 
order spatial integral of the force, equation 2.17, which equals the work done between the initial and final 
conditions. The work done is a simple scalar quantity that depends on the initial and final location for 
conservative forces. Newton’s equation of motion is 


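    \mathbf{F} = \frac{d\mathbf{p}}{dt}    (6.5)

The kinetic energy is given by

    T = \frac{1}{2} m v^2 = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}

It can be seen that

    \frac{\partial T}{\partial \dot{x}} = p_x    (6.6)

and

    \frac{d}{dt}\frac{\partial T}{\partial \dot{x}} = \frac{dp_x}{dt} = F_x    (6.7)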


Consider that the force, acting on a mass m, is arbitrarily separated into two components, one part that
is conservative, and thus can be written as the gradient of a scalar potential U, plus the excluded part of
the force, F^{EX}. The excluded part of the force F^{EX} could include non-conservative frictional forces as well
as forces of constraint which may be conservative or non-conservative. This separation allows the force to
be written as

    \mathbf{F} = -\nabla U + \mathbf{F}^{EX}    (6.8)
Along each of the x_i axes,

    \frac{d}{dt}\frac{\partial T}{\partial \dot{x}_i} = -\frac{\partial U}{\partial x_i} + F_i^{EX}    (6.9)

Equation (6.9) can be extended by transforming the cartesian coordinates x_i to the generalized coordinates
q_i.
Define the standard Lagrangian to be the difference between the kinetic energy and the potential energy,
which can be written in terms of the generalized coordinates q_i as

    L(q_i, \dot{q}_i) = T(\dot{q}_i) - U(q_i)    (6.10)

Assume that the potential is only a function of the generalized coordinates q_i, that is ∂U/∂q̇_i = 0, then

    \frac{\partial L}{\partial \dot{q}_i} = \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial U}{\partial \dot{q}_i} = \frac{\partial T}{\partial \dot{q}_i}    (6.11)

Using the above equations allows Newton's equation of motion (6.9) to be expressed as

    \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F_i^{EX}    (6.12)


The excluded force F_i^{EX} can be partitioned into a holonomic constraint force F_i^{HC}, plus any remaining
excluded forces F_i^{EXC}, as given by

    F_i^{EX} = F_i^{HC} + F_i^{EXC}    (6.13)

A comparison of equations (6.12, 6.13) and (6.4) shows that the holonomic constraint forces F_i^{HC}, that are
contained in the excluded force F_i^{EX}, can be identified with the Lagrange multiplier term in equation 6.4.

    F_i^{HC} = \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i}    (6.14)


That is, the Lagrange multiplier terms can be used to account for holonomic constraint forces F_i^{HC}. Thus
equation 6.12 can be written as

    \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = \sum_{k}^{m} \lambda_k(t) \frac{\partial g_k}{\partial q_i} + F_i^{EXC}    (6.15)

where the Lagrange multiplier term accounts for holonomic constraint forces, and F_i^{EXC} includes all the
remaining forces that are not accounted for by the scalar potential U, or the Lagrange multiplier terms F_i^{HC}.
For holonomic, conservative forces it is possible to absorb all the forces into the potential U plus the
Lagrange multiplier term, that is F_i^{EXC} = 0. Moreover, the use of a minimal set of generalized coordinates
allows the holonomic constraint forces to be ignored by explicitly reducing the number of coordinates from
n dependent coordinates to s = n − m independent generalized coordinates. That is, the correlations due
to the constraint forces are embedded into the generalized coordinates. Then equation 6.15 reduces to the
basic Euler differential equations.

    \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0    (6.16)

Note that equation 6.16 is identical to Euler's equation 5.34, if the independent variable x is replaced
by time t. Thus Newton's equations of motion are equivalent to minimizing the action integral S = \int_{t_i}^{t_f} L\,dt,
that is

    \delta S = \delta \int_{t_i}^{t_f} L\, dt = 0

which is Hamilton's Principle. Hamilton's Principle underlies many aspects of physics and, as discussed in
chapter 9, is used as the starting point for developing classical mechanics. Hamilton's Principle was
postulated 46 years after Lagrange introduced Lagrangian mechanics.

The above plausibility argument, which is based on Newtonian mechanics, illustrates the close connection 
between the vectorial Newtonian mechanics and the algebraic Lagrangian mechanics approaches to classical 
mechanics. 
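
As a short worked illustration of this connection (a sketch for a single particle moving in one dimension in an assumed potential U(x), which is not an example taken from the text), equations 6.10 and 6.16 give

```latex
\begin{align*}
L(x,\dot{x}) &= \tfrac{1}{2} m \dot{x}^2 - U(x) \\
\frac{\partial L}{\partial \dot{x}} &= m\dot{x} = p_x ,
\qquad
\frac{\partial L}{\partial x} = -\frac{dU}{dx} \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x}
 &= m\ddot{x} + \frac{dU}{dx} = 0
\end{align*}
```

The last line is Newton's second law with the conservative force F_x = −dU/dx. For the particular choice U = kx²/2 it reduces to the harmonic oscillator equation mẍ = −kx.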
6.3 Lagrange equations from d’Alembert’s Principle


6.3.1 d’Alembert’s Principle of Virtual Work 


The Principle of Virtual Work provides a basis for a rigorous derivation of Lagrangian mechanics. Bernoulli
introduced the concept of virtual infinitesimal displacement of a system mentioned in chapter 5.9.1. This
refers to a change in the configuration of the system as a result of any arbitrary infinitesimal instantaneous
change of the coordinates δr_i, that is consistent with the forces and constraints imposed on the system at
the instant t. Lagrange's symbol δ is used to designate a virtual displacement which is called "virtual" to
imply that there is no change in time t, i.e. δt = 0. This distinguishes it from an actual displacement dr_i of
body i during a time interval dt when the forces and constraints may change.

Suppose that the system of N particles is in equilibrium, that is, the total force on each particle i is
zero. The virtual work done by the force F_i moving a distance δr_i is given by the dot product F_i · δr_i. For
equilibrium, the sum of all these products for the N bodies also must be zero

    \sum_{i}^{N} \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0    (6.18)

Decomposing the force F_i on particle i into applied forces F_i^A and constraint forces f_i^C gives

    \sum_{i}^{N} \mathbf{F}_i^{A} \cdot \delta\mathbf{r}_i + \sum_{i}^{N} \mathbf{f}_i^{C} \cdot \delta\mathbf{r}_i = 0    (6.19)

The second term in equation 6.19 can be ignored if the virtual work due to the constraint forces is zero.
This is rigorously true for rigid bodies and is valid for any forces of constraint where the constraint forces
are perpendicular to the constraint surface and the virtual displacement is tangent to this surface. Thus if
the constraint forces do no work, then (6.19) reduces to

    \sum_{i}^{N} \mathbf{F}_i^{A} \cdot \delta\mathbf{r}_i = 0    (6.20)

This relation is Bernoulli's Principle of Static Virtual Work and is used to solve problems in statics.
Bernoulli introduced dynamics by using Newton's Law to relate force and momentum.


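    \mathbf{F}_i = \dot{\mathbf{p}}_i    (6.21)

Equation (6.21) can be rewritten as

    \mathbf{F}_i - \dot{\mathbf{p}}_i = 0    (6.22)

In 1742, d'Alembert developed the Principle of Dynamic Virtual Work in the form

    \sum_{i}^{N} (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0    (6.23)

Using equations (6.19) plus (6.23) gives

    \sum_{i}^{N} (\mathbf{F}_i^{A} - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i + \sum_{i}^{N} \mathbf{f}_i^{C} \cdot \delta\mathbf{r}_i = 0    (6.24)

For the special case where the forces of constraint do no virtual work, the second term vanishes and
equation 6.24 reduces to d'Alembert's Principle

    \sum_{i}^{N} (\mathbf{F}_i^{A} - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0    (6.25)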
i 
d’Alembert’s Principle, by a stroke of genius, cleverly transforms the principle of virtual work from the realm 
of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between 
the forces, whereas d’Alembert’s principle applied to dynamics leads to differential equations. 
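
As a minimal numerical sketch of d'Alembert's Principle (6.25), consider an Atwood machine, that is, two masses joined by an inextensible string over a massless frictionless pulley. This system, the function names, and the numerical values are illustrative assumptions rather than an example from the text. The string constraint requires δx_2 = −δx_1, and the tension does no net virtual work, so only gravity appears as an applied force.

```python
def virtual_work(a, m1, m2, g, dx=1.0):
    """Sum of (F_i^A - p_i_dot) * delta_x_i for an Atwood machine.

    Each coordinate is measured downward for its own mass. The string
    imposes delta_x2 = -delta_x1 and therefore a2 = -a1 = -a.
    """
    work1 = (m1 * g - m1 * a) * dx          # mass 1: weight minus m*a
    work2 = (m2 * g - m2 * (-a)) * (-dx)    # mass 2 moves the opposite way
    return work1 + work2


def atwood_acceleration(m1, m2, g=9.81):
    """Solve virtual_work(a) = 0, using the fact that it is linear in a."""
    w0 = virtual_work(0.0, m1, m2, g)
    w1 = virtual_work(1.0, m1, m2, g)
    return -w0 / (w1 - w0)


if __name__ == "__main__":
    print(atwood_acceleration(3.0, 1.0))
```

Setting the virtual work to zero reproduces the familiar result a = (m_1 − m_2) g / (m_1 + m_2) without ever solving for the string tension, which is the practical advantage of the principle.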
6.3.2 Transformation to generalized coordinates

In classical mechanical systems the coordinates r_i usually are not independent due to the forces of constraint
and the constraint-force energy contributes to equation 6.24. These problems can be eliminated by expressing
d'Alembert's Principle in terms of virtual displacements of n independent generalized coordinates q_i of the
system for which the constraint force term Σ_i f_i^C · δq_i = 0. Then the individual variational coefficients δq_i
are independent and (F_i^A − ṗ_i) · δq_i = 0 can be equated to zero for each value of i.
The transformation of the N-body system to n independent generalized coordinates q_k can be expressed
as

    \mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_n, t)    (6.26)

Assuming n independent coordinates, then the velocity v_i can be written in terms of general coordinates q_k
using the chain rule for partial differentiation.

    \mathbf{v}_i \equiv \frac{d\mathbf{r}_i}{dt} = \sum_{j}^{n} \frac{\partial \mathbf{r}_i}{\partial q_j} \dot{q}_j + \frac{\partial \mathbf{r}_i}{\partial t}    (6.27)

The arbitrary virtual displacement δr_i can be related to the virtual displacement of the generalized coordinate
δq_j by

    \delta\mathbf{r}_i = \sum_{j}^{n} \frac{\partial \mathbf{r}_i}{\partial q_j} \delta q_j    (6.28)

Note that by definition, a virtual displacement considers only displacements of the coordinates, and no time
variation δt is involved.


The above transformations can be used to express d'Alembert's dynamical principle of virtual work in
generalized coordinates. Thus the first term in d'Alembert's Dynamical Principle, (6.25) becomes

    \sum_{i}^{N} \mathbf{F}_i^{A} \cdot \delta\mathbf{r}_i = \sum_{i,j}^{N,n} \mathbf{F}_i^{A} \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \delta q_j = \sum_{j}^{n} Q_j \delta q_j    (6.29)

where Q_j are called components of the generalized force,¹ defined as

    Q_j \equiv \sum_{i}^{N} \mathbf{F}_i^{A} \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}    (6.30)

Note that just as the generalized coordinates q_j need not have the dimensions of length, so the Q_j do not
necessarily have the dimensions of force, but the product Q_j δq_j must have the dimensions of work. For
example, Q_j could be torque and δq_j could be the corresponding infinitesimal rotation angle.

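The second term in d'Alembert's Principle (6.25) can be transformed using equation 6.28

    \sum_{i}^{N} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i}^{N} m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i = \sum_{j}^{n} \left( \sum_{i}^{N} m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j    (6.31)

The right-hand side of (6.31) can be rewritten as

    \sum_{j}^{n} \left( \sum_{i}^{N} m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j = \sum_{j}^{n} \left\{ \sum_{i}^{N} \left[ \frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) - m_i \dot{\mathbf{r}}_i \cdot \frac{d}{dt}\left( \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \right] \right\} \delta q_j    (6.32)

Note that equation (6.27) gives that

    \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} = \frac{\partial \mathbf{r}_i}{\partial q_j}    (6.33)

therefore the first right-hand term in (6.32) can be written as

    \frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) = \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} \right)    (6.34)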


1 This proof, plus the notation, conform with that used by Goldstein [Go50] and by other texts on classical mechanics. 
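The second right-hand term in (6.32) can be rewritten by interchanging the order of the differentiation with
respect to t and q_j

    \frac{d}{dt}\left( \frac{\partial \mathbf{r}_i}{\partial q_j} \right) = \frac{\partial \mathbf{v}_i}{\partial q_j}    (6.35)

Substituting (6.34) and (6.35) into (6.32) gives

    \sum_{i}^{N} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{j}^{n} \left\{ \sum_{i}^{N} \left[ \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} \right) - m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial q_j} \right] \right\} \delta q_j    (6.36)

Inserting (6.29) and (6.36) into d'Alembert's Principle (6.25) leads to the relation

    \sum_{i}^{N} (\mathbf{F}_i^{A} - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = -\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial}{\partial \dot{q}_j} \sum_{i}^{N} \tfrac{1}{2} m_i v_i^2 \right) - \frac{\partial}{\partial q_j} \sum_{i}^{N} \tfrac{1}{2} m_i v_i^2 \right\} - Q_j \right] \delta q_j = 0    (6.37)

The Σ_i ½ m_i v_i² term can be identified with the system kinetic energy T. Thus d'Alembert's Principle reduces
to the relation

    \sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} - Q_j \right] \delta q_j = 0    (6.38)

For cartesian coordinates T is a function only of velocities (ẋ, ẏ, ż) and thus the term ∂T/∂q_j = 0. However,
as discussed in appendix C.2.2, for curvilinear coordinates ∂T/∂q_j ≠ 0 due to the curvature of the coordinates
as is illustrated for polar coordinates where v = ṙ r̂ + r θ̇ θ̂.
If all the n generalized coordinates q_j are independent, then equation 6.38 implies that the term in the
square brackets is zero for each individual value of j. This leads to the basic Euler-Lagrange equations of
motion for each of the independent generalized coordinates

    \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} = Q_j    (6.39)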

where n ≥ j ≥ 1. That is, this leads to n Euler-Lagrange equations of motion for the generalized forces Q_j.
As discussed in chapter 5.8, when m holonomic constraint forces apply, it is possible to reduce the system
to s = n − m independent generalized coordinates for which equation 6.25 applies.
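
The role of the ∂T/∂q_j term can be checked numerically with a small sketch. It assumes a free particle (Q_j = 0) of unit mass moving uniformly along the straight line x = b, y = vt, described in plane polar coordinates; the function names and values are illustrative assumptions. With T = ½m(ṙ² + r²θ̇²), the radial equation from (6.39) is r̈ − rθ̇² = 0, where the second term comes entirely from ∂T/∂r.

```python
import math


def polar_state(t, v=2.0, b=1.0):
    """Polar coordinates (r, theta) of a free particle on x = b, y = v*t."""
    x, y = b, v * t
    return math.hypot(x, y), math.atan2(y, x)


def radial_terms(t, h=1e-4):
    """Central-difference estimates of r_ddot and r*theta_dot**2 (unit mass)."""
    r_m, th_m = polar_state(t - h)
    r_0, _ = polar_state(t)
    r_p, th_p = polar_state(t + h)
    r_ddot = (r_p - 2.0 * r_0 + r_m) / h**2
    theta_dot = (th_p - th_m) / (2.0 * h)
    return r_ddot, r_0 * theta_dot**2


if __name__ == "__main__":
    r_ddot, centripetal = radial_terms(0.7)
    print(r_ddot, centripetal, r_ddot - centripetal)
```

Although no force acts, r̈ is not zero; it is balanced by the rθ̇² term, so dropping ∂T/∂r would wrongly predict a radial force on a freely moving particle.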

In 1687 Leibniz proposed minimizing the time integral of his “vis viva”, which equals 2T. That is,

    \delta \int_{t_i}^{t_f} T\, dt = 0    (6.40)


The variational equation 6.39 accomplishes the minimization of equation 6.40. It is remarkable that Leibniz 
anticipated the basic variational concept prior to the birth of the developers of Lagrangian mechanics, i.e., 
d’Alembert, Euler, Lagrange, and Hamilton. 


6.3.3 Lagrangian 


The handling of both conservative and non-conservative generalized forces Q_j is best achieved by assuming
that the generalized force Q_j = Σ_i F_i^A · ∂r_i/∂q_j can be partitioned into a conservative velocity-independent term,
that can be expressed in terms of the gradient of a scalar potential, −∇U_j, plus an excluded generalized force
Q_j^{EX} which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly
included in the potential U_j. That is,

    Q_j = -\nabla U_j + Q_j^{EX}    (6.41)

Inserting (6.41) into (6.38), and assuming that the potential U is velocity independent, allows (6.38) to be
rewritten as

    \sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial (T - U)}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0    (6.42)
The definition of the Standard Lagrangian is

    L \equiv T - U    (6.43)

then (6.42) can be written as

    \sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0    (6.44)

Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) as a special case when U = 0.
In addition, note that if all the generalized coordinates are independent, then the square bracket terms are
zero for each value of j, which leads to the general Euler-Lagrange equations of motion

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX}    (6.45)

where n ≥ j ≥ 1.


Chapter 6.5.3 will show that the holonomic constraint forces can be factored out of the generalized force
term Q_j^{EX} which simplifies derivation of the equations of motion using Lagrangian mechanics. The general
Euler-Lagrange equations of motion are used extensively in classical mechanics because conservative forces
play a ubiquitous role in classical mechanics.
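
A minimal sketch of equation (6.45) with a non-zero excluded force is the linearly damped oscillator. It assumes L = ½mẋ² − ½kx² and Q^{EX} = −bẋ, so that (6.45) gives mẍ + kx = −bẋ; the parameter values, function names, and the choice of a fourth-order Runge-Kutta integrator are illustrative assumptions, not part of the text.

```python
def acceleration(x, v, m, k, b):
    """x_ddot from d/dt(dL/dv) - dL/dx = Q_ex, with Q_ex = -b*v."""
    return (-k * x - b * v) / m


def simulate(x0, v0, m=1.0, k=4.0, b=0.3, dt=1e-3, steps=5000):
    """Integrate with classical fourth-order Runge-Kutta; return final (x, v)."""
    x, v = x0, v0
    for _ in range(steps):
        k1x, k1v = v, acceleration(x, v, m, k, b)
        k2x = v + 0.5 * dt * k1v
        k2v = acceleration(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, m, k, b)
        k3x = v + 0.5 * dt * k2v
        k3v = acceleration(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, m, k, b)
        k4x = v + dt * k3v
        k4v = acceleration(x + dt * k3x, v + dt * k3v, m, k, b)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x, v


def energy(x, v, m=1.0, k=4.0):
    """Mechanical energy T + U of the undamped part of the system."""
    return 0.5 * m * v**2 + 0.5 * k * x**2


if __name__ == "__main__":
    print(energy(*simulate(1.0, 0.0, b=0.0)), energy(*simulate(1.0, 0.0, b=0.3)))
```

With b = 0 the right-hand side of (6.45) vanishes and T + U is conserved; with b > 0 the excluded generalized force removes energy, which the Lagrangian alone cannot describe.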


6.4 Lagrange equations from Hamilton’s Action Principle 


Hamilton published two papers in 1834 and 1835, announcing a fundamental new dynamical principle that
underlies both Lagrangian and Hamiltonian mechanics. Hamilton was seeking a theory of optics when he
developed Hamilton's Action Principle, plus the field of Hamiltonian mechanics, both of which play a crucial
role in classical mechanics and modern physics. Hamilton's Action Principle states “dynamical systems
follow paths that minimize the time integral of the Lagrangian”. That is, the action functional S

    S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt    (6.46)

has a minimum value for the correct path of motion. Hamilton's Action Principle can be written in
terms of a virtual infinitesimal displacement δ, as

    \delta S = \delta \int_{t_i}^{t_f} L\, dt = 0    (6.47)

Variational calculus therefore implies that a system of s independent generalized coordinates must satisfy
the basic Lagrange-Euler equations

    \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0    (6.48)

Note that for Q_j^{EX} = 0, this is the same as equation 6.45 which was derived using d'Alembert's Principle.

This discussion has shown that Euler's variational differential equation underlies both the differential
variational d'Alembert Principle, and the more fundamental integral Hamilton's Action Principle. As discussed
in chapter 9.2, Hamilton's Principle of Stationary Action adds a fundamental new dimension to classical
mechanics which leads to derivation of both Lagrangian and Hamiltonian mechanics. That is, both
Hamilton's Action Principle, and d'Alembert's Principle, can be used to derive Lagrangian mechanics leading to
the most general Lagrange equations that are applicable to both holonomic and non-holonomic constraints,
as well as conservative and non-conservative systems. In addition, Chapter 6.2 presented a plausibility
argument showing that Lagrangian mechanics can be justified based on Newtonian mechanics. Hamilton's
Action Principle, and d'Alembert's Principle, can be expressed in terms of generalized coordinates which is
much broader in scope than the equations of motion implied using Newtonian mechanics.
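
Hamilton's Action Principle can be explored numerically with a minimal sketch. It assumes a particle of unit mass in a uniform gravitational field, L = ½mẋ² − mgx, a piecewise-linear discretization of the path, and a trial variation ε sin(πt/T) that vanishes at both end points; all names and numerical values are illustrative assumptions.

```python
import math


def discrete_action(path, dt, m=1.0, g=9.81):
    """Action of a piecewise-linear path for L = m*v**2/2 - m*g*x (midpoint rule)."""
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        total += (0.5 * m * v**2 - m * g * x_mid) * dt
    return total


def trial_path(eps, n=200, t_total=1.0, v0=3.0, g=9.81):
    """Newtonian path v0*t - g*t**2/2 plus the variation eps*sin(pi*t/T)."""
    dt = t_total / n
    path = []
    for k in range(n + 1):
        t = k * dt
        newton = v0 * t - 0.5 * g * t**2
        path.append(newton + eps * math.sin(math.pi * t / t_total))
    return path, dt


def action(eps):
    """Discrete action of the trial path with variation amplitude eps."""
    path, dt = trial_path(eps)
    return discrete_action(path, dt)


if __name__ == "__main__":
    for eps in (-0.1, 0.0, 0.1):
        print(eps, action(eps))
```

The action is the same for +ε and −ε to within rounding, so the first-order variation vanishes on the Newtonian path, and it is larger for ε ≠ 0, so for this Lagrangian the stationary path is a minimum.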
6.5 Constrained systems


The motion for systems subject to constraints is difficult to calculate using Newtonian mechanics because 
all the unknown constraint forces must be included explicitly with the active forces in order to determine 
the equations of motion. Lagrangian mechanics avoids these difficulties by allowing selection of independent 
generalized coordinates that incorporate the correlated motion induced by the constraint forces. This allows 
the constraint forces acting on the system to be ignored by reducing the system to a minimal set of generalized 
coordinates. The holonomic constraint forces can be determined using the Lagrange multiplier approach, or 
all constraint forces can be determined by including them as generalized forces, as described below. 


6.5.1 Choice of generalized coordinates 


As discussed in chapter 5.8, the flexibility and freedom for selection of generalized coordinates is a
considerable advantage of Lagrangian mechanics when handling constrained systems. The generalized coordinates
can be any set of independent variables that completely specify the scalar action functional, equation 6.46.
The generalized coordinates are not required to be orthogonal as is required when using the vectorial
Newtonian approach. The secret to using generalized coordinates is to select coordinates that are perpendicular
to the constraint forces so that the constraint forces do no work. Moreover, if the constraints are rigid, then
the constraint forces do no work in the direction of the constraint force. As a consequence, the constraint
forces do not contribute to the action integral and thus the Σ_i f_i^C · δr_i term in equation 6.19 can be
omitted from the action integral. Generalized coordinates allow reducing the number of unknowns from n to
s = n − m when the system has m holonomic constraints. In addition, generalized coordinates facilitate
using both the Lagrange multipliers, and the generalized forces, approaches for determining the constraint
forces.


6.5.2 Minimal set of generalized coordinates 


The set of n generalized coordinates q_j are used to describe the motion of the system. No restrictions have
been placed on the nature of the constraints other than they are workless for a virtual displacement. If the
m constraints are holonomic, then it is possible to find sets of s = n − m independent generalized coordinates
q_j that contain the m constraint conditions implicitly in the transformation equations

    \mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_s, t)    (6.49)

For the case of s = n − m unknowns, any virtual displacement δq_j is independent of δq_k, therefore the
only way for (6.44) to hold is for the term in brackets to vanish for each value of j, that is

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX}    (6.50)

where j = 1, 2, 3, ..., s. These are the Lagrange equations for the minimal set of s independent generalized
coordinates.

If all the generalized forces are conservative plus velocity independent, and are included in the potential
U, and Q_j^{EX} = 0, then (6.50) simplifies to

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = 0    (6.51)

This is Euler's differential equation, derived earlier using the calculus of variations. Thus d'Alembert's
Principle leads to a solution that minimizes the action integral, \delta \int_{t_i}^{t_f} L\,dt = 0, as stated by Hamilton's
Principle.
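
The minimal-set approach can be sketched for the plane pendulum, which is an illustrative choice rather than an example from this section. The two cartesian coordinates and the single holonomic constraint of fixed length leave s = 2 − 1 = 1 generalized coordinate θ, with L = ½ml²θ̇² + mgl cos θ, so that (6.51) gives θ̈ = −(g/l) sin θ. The function names, step size, and integrator below are assumptions.

```python
import math


def theta_ddot(theta, g, length):
    """Euler-Lagrange equation for L = m*l**2*theta_dot**2/2 + m*g*l*cos(theta)."""
    return -(g / length) * math.sin(theta)


def quarter_period(theta0, g=9.81, length=1.0, dt=1e-3):
    """Release from rest at 0 < theta0 < pi; time the first passage through theta = 0."""
    theta, omega, t = theta0, 0.0, 0.0
    while True:
        # One classical fourth-order Runge-Kutta step for (theta, omega).
        k1t, k1w = omega, theta_ddot(theta, g, length)
        k2t = omega + 0.5 * dt * k1w
        k2w = theta_ddot(theta + 0.5 * dt * k1t, g, length)
        k3t = omega + 0.5 * dt * k2w
        k3w = theta_ddot(theta + 0.5 * dt * k2t, g, length)
        k4t = omega + dt * k3w
        k4w = theta_ddot(theta + dt * k3t, g, length)
        new_theta = theta + dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        new_omega = omega + dt * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        if new_theta <= 0.0:
            # Linear interpolation for the crossing time inside this step.
            return t + dt * theta / (theta - new_theta)
        theta, omega, t = new_theta, new_omega, t + dt


if __name__ == "__main__":
    print(4.0 * quarter_period(0.05), 2.0 * math.pi * math.sqrt(1.0 / 9.81))
```

For small amplitude the computed period approaches 2π√(l/g). The tension in the rod never appears, which illustrates both the economy of the minimal set of coordinates and the fact that this approach does not determine the constraint force.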


6.5.3 Lagrange multipliers approach 


Equation (6.44) sums over all n coordinates for N particles, providing n equations of motion. If the m
constraints are holonomic they can be expressed by m algebraic equations of constraint

    g_k(q_1, q_2, \ldots, q_n, t) = 0    (6.52)
where k = 1, 2, 3, ..., m. Kinematic constraints can be expressed in terms of the infinitesimal displacements
of the form

    \sum_{j=1}^{n} \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\, dq_j + \frac{\partial g_k}{\partial t}\, dt = 0    (6.53)

where k = 1, 2, 3, ..., m, j = 1, 2, 3, ..., n, and where the ∂g_k/∂q_j and ∂g_k/∂t are functions of the generalized coordinates
q_j, described by the vector q, that are derived from the equations of constraint. As discussed in chapter 5.7,
if (6.53) represents the total differential of a function, then it can be integrated to give a holonomic relation
of the form of equation (6.52). However, if (6.53) is not the total differential, then it can be integrated only
after having solved the full problem. If ∂g_k/∂t = 0 then the k-th constraint is scleronomic.
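
Whether a differential constraint of the form Σ_j a_j dq_j = 0 is a total differential as written can be tested with a rough numerical sketch. The check below evaluates ∂a_i/∂q_j − ∂a_j/∂q_i by central differences at a single sample point, which is only a necessary condition: a form that fails it may still become integrable after multiplication by an integrating factor, and that case is not tested here. The two example constraints (a particle on a sphere, and an upright disk rolling along its heading) and all names are illustrative assumptions, and explicit time dependence is ignored.

```python
import math


def curl_components(coeffs, q, h=1e-5):
    """Return d a_i/d q_j - d a_j/d q_i for all i < j, by central differences.

    coeffs(q) returns the coefficients a_j of the form sum_j a_j dq_j = 0.
    """
    n = len(q)

    def partial(i, j):
        q_plus, q_minus = list(q), list(q)
        q_plus[j] += h
        q_minus[j] -= h
        return (coeffs(q_plus)[i] - coeffs(q_minus)[i]) / (2.0 * h)

    return {(i, j): partial(i, j) - partial(j, i)
            for i in range(n) for j in range(i + 1, n)}


def is_total_differential(coeffs, q, tol=1e-6):
    """True if every antisymmetric derivative vanishes at the sample point q."""
    return all(abs(c) < tol for c in curl_components(coeffs, q).values())


def sphere(q):
    """a = grad(x**2 + y**2 + z**2 - R**2): a holonomic constraint."""
    x, y, z = q
    return [2.0 * x, 2.0 * y, 2.0 * z]


def rolling_disk(q, radius=0.5):
    """dx - R*cos(phi)*dpsi = 0 in the coordinates q = (x, phi, psi)."""
    x, phi, psi = q
    return [1.0, 0.0, -radius * math.cos(phi)]


if __name__ == "__main__":
    print(is_total_differential(sphere, [0.3, -0.2, 0.9]))
    print(is_total_differential(rolling_disk, [0.0, 0.3, 0.0]))
```

The sphere constraint passes because its coefficients are the gradient of a function, whereas the rolling constraint fails because the coefficient of dψ depends on the heading φ.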

The discussion of Lagrange multipliers in chapter 5.9.1, showed that, for virtual displacements δq_j,
the correlation of the generalized coordinates, due to the constraint forces, can be taken into account by
multiplying (6.53) by unknown Lagrange multipliers λ_k and summing over all m constraints. Generalized
forces can be partitioned into a Lagrange multiplier term plus a remainder force. That is

    Q_j^{EX} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}    (6.54)

since by definition δt = 0 for virtual displacements.

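Chapter 5.9.1 showed that holonomic forces of constraint can be taken into account by introducing
the Lagrange undetermined multipliers approach, which is equivalent to defining an extended Lagrangian
L'(q, q̇, λ, t) where

    L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t) = L(\mathbf{q}, \dot{\mathbf{q}}, t) + \sum_{k=1}^{m} \lambda_k g_k(\mathbf{q}, t)    (6.55)

Finding the extremum for the extended Lagrangian L'(q, q̇, λ, t) using (6.47) gives

    \sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} \right] \delta q_j = 0    (6.56)

where Q_j^{EXC} is the remaining part of the generalized force Q_j after subtracting both the part of the force
absorbed in the potential energy U, which is buried in the Lagrangian L, as well as the holonomic constraint
forces which are included in the Lagrange multiplier terms Σ_k^m λ_k ∂g_k/∂q_j (q, t). The m Lagrange multipliers
λ_k can be chosen arbitrarily in (6.56). Utilizing the free choice of the m Lagrange multipliers λ_k allows them
to be determined in such a way that the coefficients of the first m infinitesimals δq_j, i.e. the square brackets,
vanish. Therefore the expression in the square bracket must vanish for each value of 1 ≤ j ≤ m. Thus it
follows that

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0    (6.57)

when j = 1, 2, ..., m. Thus (6.56) reduces to a sum over the remaining coordinates m + 1 ≤ j ≤ n

    \sum_{j=m+1}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} \right] \delta q_j = 0    (6.58)

In equation (6.58) the s = n − m infinitesimals δq_j can be chosen freely since the s = n − m degrees
of freedom are independent. Therefore the expression in the square bracket must vanish for each value of
m + 1 ≤ j ≤ n. Thus it follows that

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0    (6.59)

where j = m + 1, m + 2, ..., n. Combining equations (6.57) and (6.59) then gives the important general relation
that for 1 ≤ j ≤ n

    \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}    (6.60)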
To summarize, the Lagrange multiplier approach (6.60) automatically solves the n equations plus the
m holonomic equations of constraint, which determines the n + m unknowns, that is, the n coordinates
plus the m forces of constraint. The beauty of the Lagrange multipliers is that all n variables, plus the m
constraint forces, are found simultaneously by using the calculus of variations to determine the extremum
for the extended Lagrangian L'(q, q̇, λ, t).
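
As a minimal sketch of how the multiplier delivers the constraint force, consider the plane pendulum in cartesian coordinates with y measured upward and the single holonomic constraint g_1 = x² + y² − l² = 0. This example, the sign conventions, and the function names are illustrative assumptions. Equation (6.60) gives mẍ = 2λx and mÿ = −mg + 2λy; differentiating the constraint twice with respect to time and eliminating the accelerations yields λ in closed form.

```python
import math


def multiplier(m, g, length, x, y, vx, vy):
    """Lagrange multiplier for the constraint x**2 + y**2 - l**2 = 0 (y upward).

    Follows from m*x_ddot = 2*lam*x, m*y_ddot = -m*g + 2*lam*y and the twice
    differentiated constraint x*x_ddot + y*y_ddot + vx**2 + vy**2 = 0.
    """
    return m * (g * y - (vx**2 + vy**2)) / (2.0 * length**2)


def constraint_force(m, g, length, theta, theta_dot):
    """Force lam * grad(g_1) at angle theta measured from the downward vertical."""
    x, y = length * math.sin(theta), -length * math.cos(theta)
    vx = length * theta_dot * math.cos(theta)
    vy = length * theta_dot * math.sin(theta)
    lam = multiplier(m, g, length, x, y, vx, vy)
    return 2.0 * lam * x, 2.0 * lam * y


if __name__ == "__main__":
    fx, fy = constraint_force(2.0, 9.81, 1.5, 0.4, 1.2)
    print(math.hypot(fx, fy))
```

The magnitude of λ∇g_1 equals the rod tension m(g cos θ + lθ̇²) found from elementary Newtonian arguments, and the force points from the bob toward the pivot.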


6.5.4 Generalized forces approach 


The two right-hand terms in (6.60) can be understood to be those forces acting on the system that are
not absorbed into the scalar potential U component of the Lagrangian L. The Lagrange multiplier terms
Σ_k^m λ_k ∂g_k/∂q_j (q, t) account for the holonomic forces of constraint that are not included in the conservative
potential or in the generalized forces Q_j^{EXC}. The generalized force

    Q_j^{EXC} = \sum_{i}^{N} \mathbf{F}_i^{EXC} \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}    (6.17)

is the sum of the components in the q_j direction for all external forces that have not been taken into account
by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force Q_j^{EXC}
contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not
included in U, or used in the Lagrange multiplier terms to account for the holonomic constraint forces.

The concept of generalized forces is illustrated by the case of spherical coordinate systems. The attached
table gives the displacement elements δq_j (taken from table C4) and the generalized force for the three
coordinates. Note that Q_r has the dimensions of force and Q_r δr has the units of energy. By contrast
equation 6.30 gives that Q_θ = F_θ r and Q_φ = F_φ r sin θ which have the dimensions of torque. However, Q_θ δθ and
Q_φ δφ both have the dimensions of energy as is required in equation 6.30. This illustrates that the units used
for generalized forces depend on the units of the corresponding generalized coordinate.

Unit vectors | δq_j | δr · q̂_j      | Q_j           | Q_j δq_j
r̂            | δr   | δr            | F_r           | F_r δr
θ̂            | δθ   | r δθ          | F_θ r         | F_θ r δθ
φ̂            | δφ   | r sin θ δφ    | F_φ r sin θ   | F_φ r sin θ δφ

6.6 Applying the Euler-Lagrange equations to classical mechanics 


d’Alembert’s principle of virtual work has been used to derive the Euler-Lagrange equations, which also satisfy Hamilton’s Principle and the Newtonian plausibility argument. These imply that the actual path taken in configuration space $(q_j, \dot q_j, t)$ is the one that minimizes the action integral $\int_{t_i}^{t_f} L(q_j, \dot q_j; t)\,dt$. As a consequence, the Euler equations for the calculus of variations lead to the Lagrange equations of motion

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \qquad (6.60)$$

for $n$ variables, with $m$ equations of constraint. The generalized forces $Q_j^{EXC}$ are not included in the conservative potential energy $U$, or in the Lagrange multipliers approach for holonomic equations of constraint.²
The following is a logical procedure for applying the Euler-Lagrange equations to classical mechanics.


1) Select a set of independent generalized coordinates: 


Select an optimum set of independent generalized coordinates as described in chapter 6.5.1. Use of generalized coordinates is always advantageous since they incorporate the constraints, and can reduce the number of unknowns, both of which simplify use of Lagrangian mechanics.


²Euler's differential equation is ubiquitous in Lagrangian mechanics. Thus, for brevity, it is convenient to define the concept of the Lagrange linear operator $\Lambda_j$, as described in appendix F2,

$$\Lambda_j \equiv \frac{d}{dt}\frac{\partial}{\partial \dot q_j} - \frac{\partial}{\partial q_j}$$

where $\Lambda_j$ operates on the Lagrangian $L$. Then Euler's equations can be written compactly in the form $\Lambda_j L = 0$.

2) Partition of the active forces:


The active forces should be partitioned into the following three groups: 

(i) Conservative one-body forces plus the velocity-dependent electromagnetic force which 
can be characterized by the scalar potential U, that is absorbed into the Lagrangian. The gravitational 
forces plus the velocity-dependent electromagnetic force can be absorbed into the potential U as discussed 
in chapter 6.10. This approach is by far the easiest way to account for such forces in Lagrangian mechanics. 

(ii) Holonomic constraint forces provide algebraic relations that couple some of the generalized coor- 
dinates. This coupling can be used either to reduce the number of generalized coordinates, or to determine 
these holonomic constraint forces using the Lagrange multiplier approach. 

(iii) Generalized forces provide a mechanism for introducing non-conservative and non-holonomic constraint forces into Lagrangian mechanics. Typically generalized forces are used to introduce dissipative forces.

Typical systems can involve a mixture of all three categories of active forces. For example, mechanical systems often include gravity, which is introduced as a potential, holonomic constraint forces, which are determined using Lagrange multipliers, and dissipative forces, which are included as generalized forces.


3) Minimal set of generalized coordinates: 


The ability to embed constraint forces directly into the generalized coordinates is a tremendous advantage enjoyed by the Lagrangian and Hamiltonian variational approaches to classical mechanics. If the constraint forces are not required, then choice of a minimal set of generalized coordinates significantly reduces the number of equations of motion that need to be solved.


4) Derive the Lagrangian: 


The Lagrangian is derived in terms of the generalized coordinates, including the conservative forces that are absorbed into the scalar potential $U$.

5) Derive the equations of motion:

Equation (6.60) is solved to determine the $n$ generalized coordinates, plus the $m$ Lagrange multipliers characterizing the holonomic constraint forces, plus any generalized forces that were included. The holonomic constraint forces then are given by evaluating the $\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ terms for the $m$ holonomic forces.

In summary, Lagrangian mechanics is based on energies which are scalars, in contrast to Newtonian mechanics which is based on vector forces and momentum. As a consequence, Lagrangian mechanics allows use of any set of independent generalized coordinates, which do not have to be orthogonal, and they can have very different units for different variables. The generalized coordinates can incorporate the correlations introduced by constraint forces.

The active forces are split into the following three categories:

1. Velocity-independent conservative forces are taken into account using scalar potentials $U_i$.
2. Holonomic constraint forces can be determined using Lagrange multipliers.
3. Non-holonomic constraints require use of generalized forces $Q_j^{EXC}$.

Use of the concept of scalar potentials is a trivial and powerful way to incorporate conservative forces in Lagrangian mechanics. The Lagrange multipliers approach requires using the Euler-Lagrange equations for $n + m$ coordinates but determines both holonomic constraint forces and equations of motion simultaneously. Non-holonomic constraints and dissipative forces can be incorporated into Lagrangian mechanics via use of generalized forces, which broadens the scope of Lagrangian mechanics.

Note that the equations of motion resulting from the Lagrange-Euler algebraic approach are the same 
equations of motion as obtained using Newtonian mechanics. However, the Lagrangian is a scalar which 
facilitates rotation into the most convenient frame of reference. This can greatly simplify determination of 
the equations of motion when constraint forces apply. As discussed in chapter 17, the Lagrangian and the 
Hamiltonian variational approaches to mechanics are the only viable way to handle relativistic, statistical, 
and quantum mechanics. 
6.7 Applications to unconstrained systems


Although most dynamical systems involve constrained motion, it is useful to consider examples of systems subject to conservative forces with no constraints. For no constraints, the Lagrange-Euler equations (6.60) simplify to $\Lambda_j L = 0$ where $j = 1, 2, \ldots, n$, and the transformation to generalized coordinates is of no consequence.


6.1 Example: Motion of a free particle, U=0 


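The Lagrangian in cartesian coordinates is $L = \frac{1}{2}m(\dot x^2 + \dot y^2 + \dot z^2)$. Then

$$\frac{\partial L}{\partial \dot x} = m\dot x \qquad \frac{\partial L}{\partial \dot y} = m\dot y \qquad \frac{\partial L}{\partial \dot z} = m\dot z$$

$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0$$

Inserting these in the Lagrange equation gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x} = \frac{d}{dt}(m\dot x) - 0 = 0$$

Thus

$$p_x = m\dot x = \text{constant}$$
$$p_y = m\dot y = \text{constant}$$
$$p_z = m\dot z = \text{constant}$$

That is, this shows that the linear momentum is conserved if $U$ is a constant, that is, no forces apply. Note that momentum conservation has been derived without any direct reference to forces.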


6.2 Example: Motion in a uniform gravitational field 


Consider motion in the $x$-$y$ plane. The kinetic energy is $T = \frac{1}{2}m(\dot x^2 + \dot y^2)$ while the potential energy is $U = mgy$ where $U(y = 0) = 0$. Thus

$$L = \frac{1}{2}m(\dot x^2 + \dot y^2) - mgy$$

Using the Lagrange equation for the $x$ coordinate gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot x} - \frac{\partial L}{\partial x} = \frac{d}{dt}(m\dot x) - 0 = 0$$

Thus the horizontal momentum $m\dot x$ is conserved and $\ddot x = 0$. The $y$ coordinate gives

$$\Lambda_y L = \frac{d}{dt}\frac{\partial L}{\partial \dot y} - \frac{\partial L}{\partial y} = \frac{d}{dt}(m\dot y) + mg = 0$$

that is, $\ddot y = -g$. Thus the Lagrangian produces the same results as derived using Newton’s Laws of Motion.

Figure: Motion in a gravitational field.

The importance of selecting the most convenient generalized coordinates is nicely illustrated by trying to solve this problem using polar coordinates $r, \theta$, where $r$ is radial distance and $\theta$ the elevation angle from the $x$ axis as shown in the adjacent figure. Then

$$T = \frac{1}{2}m\dot r^2 + \frac{1}{2}m(r\dot\theta)^2$$

$$U = mgr\sin\theta$$

Thus

$$L = \frac{1}{2}m\dot r^2 + \frac{1}{2}m(r\dot\theta)^2 - mgr\sin\theta$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$r\dot\theta^2 - g\sin\theta - \ddot r = 0$$

$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$-gr\cos\theta - 2r\dot r\dot\theta - r^2\ddot\theta = 0$$

These equations written in polar coordinates are more complicated than the result expressed in cartesian coordinates. This is because the potential energy depends directly on the $y$ coordinate, whereas it is a function of both $r$ and $\theta$. This illustrates the freedom for using different generalized coordinates, plus the importance of choosing a sensible set of generalized coordinates.
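
As a numerical illustration of this point, the following minimal Python sketch integrates the two polar equations of motion quoted above with a fourth-order Runge-Kutta step and compares the end point with the closed-form cartesian solution $x = x_0 + \dot x_0 t$, $y = y_0 + \dot y_0 t - \frac{1}{2}gt^2$. The function names, the step count, and the value $g = 9.81$ m/s² are illustrative assumptions that do not appear in the text. The polar form is singular at $r = 0$, so the sketch assumes that the trajectory stays away from the origin.

```python
import math

def polar_rhs(state, g=9.81):
    """Time derivatives of (r, theta, r_dot, theta_dot) from the polar equations of motion."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot**2 - g * math.sin(theta)
    theta_ddot = -(g * math.cos(theta) + 2.0 * r_dot * theta_dot) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def projectile_polar(x0, y0, vx0, vy0, t_end, steps=2000, g=9.81):
    """Integrate in polar coordinates and return the final cartesian position (x, y)."""
    r = math.hypot(x0, y0)
    theta = math.atan2(y0, x0)
    r_dot = (x0 * vx0 + y0 * vy0) / r          # radial velocity
    theta_dot = (x0 * vy0 - y0 * vx0) / r**2   # angular velocity
    state = (r, theta, r_dot, theta_dot)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(lambda s: polar_rhs(s, g), state, dt)
    return state[0] * math.cos(state[1]), state[0] * math.sin(state[1])
```

Both coordinate choices describe the same parabola; the cartesian form simply decouples into two trivial equations, whereas the polar form needs a coupled numerical integration.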


6.3 Example: Central forces 


Consider a mass $m$ moving under the influence of a spherically-symmetric, conservative, attractive, inverse-square force. The potential then is

$$U = -\frac{k}{r}$$

It is natural to express the Lagrangian in spherical coordinates for this system. That is,

$$L = \frac{1}{2}m\dot r^2 + \frac{1}{2}m(r\dot\theta)^2 + \frac{1}{2}m(r\sin\theta\,\dot\phi)^2 + \frac{k}{r}$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$m\ddot r - mr\left[\dot\theta^2 + \sin^2\theta\,\dot\phi^2\right] = -\frac{k}{r^2}$$

where the $mr\sin^2\theta\,\dot\phi^2$ term comes from the centripetal acceleration.
$\Lambda_\phi L = 0$ for the $\phi$ coordinate gives

$$\frac{d}{dt}\left(mr^2\sin^2\theta\,\dot\phi\right) = 0$$

This implies that the time derivative of the angular momentum about the $\phi$ axis vanishes, $\dot p_\phi = 0$, and thus $p_\phi = mr^2\sin^2\theta\,\dot\phi$ is a constant of motion.
$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$\frac{d}{dt}\left(mr^2\dot\theta\right) - mr^2\sin\theta\cos\theta\,\dot\phi^2 = 0$$

That is,

$$\dot p_\theta = mr^2\sin\theta\cos\theta\,\dot\phi^2 = \frac{p_\phi^2\cos\theta}{mr^2\sin^3\theta}$$

Note that $p_\theta$ is a constant of motion if $p_\phi = 0$ and only the radial coordinate is influenced by the radial form of the central potential.
6.8 Applications to systems involving holonomic constraints


The equations of motion that result from the Lagrange-Euler algebraic approach are the same as those given by Newtonian mechanics. The solution of these equations of motion can be obtained mathematically using the chosen initial conditions. The following simple example of a disk rolling on an inclined plane is useful for comparing the merits of the Newtonian method with Lagrangian mechanics employing either minimal generalized coordinates, the Lagrange multipliers, or the generalized forces approaches.


6.4 Example: Disk rolling on an inclined plane


Consider a disk rolling down an inclined plane to compare the results obtained using Newton’s laws with the results obtained using Lagrange’s equations with either generalized coordinates, Lagrange multipliers, or generalized forces. All these cases assume that the friction is sufficient to ensure that the rolling equation of constraint applies and that the disk has a radius $R$ and moment of inertia $I$. Assume as generalized coordinates the distance $y$ along the inclined plane, which is perpendicular to the normal constraint force $N$, the distance $x$ perpendicular to the inclined plane, plus the rolling angle $\theta$. The constraint for rolling is holonomic

$$y - R\theta = 0$$

The frictional force is $F_f$. The constraint that it rolls along the plane implies

$$x - R = 0$$

Figure: Disk rolling without slipping on an inclined plane.


a) Newton’s laws of motion


Newton’s law for the components of the forces along the inclined plane gives

$$mg\sin\alpha - F_f = m\ddot y \qquad (a)$$

Perpendicular to the inclined plane, Newton's law gives

$$mg\cos\alpha = N$$

The torque on the disk gives

$$F_f R = I\ddot\theta$$

Assuming the disc rolls gives

$$\ddot y = R\ddot\theta$$

then

$$F_f = \frac{I}{R^2}\ddot y$$

Inserting this into equation (a) gives

$$\left(m + \frac{I}{R^2}\right)\ddot y - mg\sin\alpha = 0$$

The moment of inertia of a uniform solid circular disk is $I = \frac{1}{2}mR^2$. Therefore

$$\ddot y = \frac{2}{3}g\sin\alpha$$

and the frictional force is

$$F_f = \frac{mg}{3}\sin\alpha$$

which is smaller than the gravitational force along the plane, which is $mg\sin\alpha$.

b) Lagrange equations with a minimal set of generalized coordinates


Using the generalized coordinates defined above, the total kinetic energy is

$$T = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2$$

The conservative gravitational force can be absorbed into the potential energy

$$U = mg(l - y)\sin\alpha$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The holonomic equations of constraint are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

The holonomic rolling constraint can be used to reduce the system to a single generalized coordinate $y$ plus generalized velocity $\dot y$. Expressed in terms of this single generalized coordinate, the Lagrangian becomes

$$L = \frac{1}{2}\left(m + \frac{I}{R^2}\right)\dot y^2 - mg(l - y)\sin\alpha$$

The Lagrange equation $\Lambda_y L = 0$ gives

$$mg\sin\alpha = \left(m + \frac{I}{R^2}\right)\ddot y$$

Again if $I = \frac{1}{2}mR^2$ then

$$\ddot y = \frac{2}{3}g\sin\alpha$$

The solution for the $x$ coordinate is trivial. This answer is identical to that obtained using Newton’s laws of motion. Note that no forces have been determined using the single generalized coordinate.


c) Lagrange equation with Lagrange multipliers 


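Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using Lagrange multipliers. Ignoring the trivial $x$ dependence, the Lagrangian is given above to be

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The constraint equations are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

The Lagrange equation for the $y$ coordinate

$$\frac{d}{dt}\frac{\partial L}{\partial \dot y} - \frac{\partial L}{\partial y} = \lambda_1\frac{\partial g_1}{\partial y} + \lambda_2\frac{\partial g_2}{\partial y}$$

gives

$$m\ddot y - mg\sin\alpha = \lambda_1$$

The Lagrange equation for the $\theta$ coordinate

$$\frac{d}{dt}\frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = \lambda_1\frac{\partial g_1}{\partial \theta} + \lambda_2\frac{\partial g_2}{\partial \theta}$$

which gives

$$I\ddot\theta = -\lambda_1 R$$

The constraint can be written as

$$\ddot y = R\ddot\theta$$

Letting $I = \frac{1}{2}mR^2$ and solving for $\ddot y$, $\ddot\theta$ and $\lambda_1$ gives

$$\lambda_1 = -\frac{mg}{3}\sin\alpha$$

The frictional force is given by the generalized force of constraint along $y$,

$$\lambda_1\frac{\partial g_1}{\partial y} = \lambda_1 = -\frac{mg}{3}\sin\alpha = -F_f$$

where the negative sign shows that the friction acts up the plane with the magnitude $F_f = \frac{mg}{3}\sin\alpha$ found using Newton's laws. Also

$$m\ddot y = mg\sin\alpha + \lambda_1 = \frac{2}{3}mg\sin\alpha$$

and the torque is

$$-\lambda_1 R = F_f R = I\ddot\theta$$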

d) Lagrange equation using a generalized force 


Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using generalized forces. Ignoring the trivial $x$ dependence, the Lagrangian was given above to be

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\theta^2 - mg(l - y)\sin\alpha$$

The generalized forces (6.30) are

$$Q_y = -F_f$$
$$Q_\theta = F_f R$$

The Euler-Lagrange equations are:
The $\Lambda_y L = Q_y$ Lagrange equation for the $y$ coordinate

$$m\ddot y - mg\sin\alpha = Q_y = -F_f$$

The $\Lambda_\theta L = Q_\theta$ Lagrange equation for the $\theta$ coordinate

$$I\ddot\theta = Q_\theta = F_f R$$

The constraint equation gives that $\ddot y = R\ddot\theta$ and assuming $I = \frac{1}{2}mR^2$ leads to the $Q_\theta$ relation

$$\frac{Q_\theta}{R} = F_f = \frac{m}{2}\ddot y$$

Substituting this equation into the $Q_y$ relation gives that

$$m\ddot y - mg\sin\alpha = Q_y = -F_f = -\frac{m}{2}\ddot y$$

Thus

$$\ddot y = \frac{2}{3}g\sin\alpha$$

and

$$F_f = \frac{mg}{3}\sin\alpha$$

The four methods for handling the equations of constraint all are equivalent and result in the same equations of motion. The scalar Lagrangian mechanics is able to calculate the vector forces acting in a direct and simple way. The Newton's law approach is more intuitive for this simple case and the ease and power of the Lagrangian approach is not apparent for this simple system.

The following series of examples will gradually increase in complexity, and will illustrate the power, elegance, plus superiority of the Lagrangian approach compared with the Newtonian approach.

6.5 Example: Two connected masses on frictionless inclined planes


Consider the system shown in the figure. This is a problem that has five constraints that will be solved using the method of generalized coordinates. The obvious generalized coordinates are $x_1$ and $x_2$ which are perpendicular to the normal constraint forces on the inclined planes. Another holonomic constraint is that the length of the rope connecting the masses is assumed to be constant. Thus the equation of constraint is that

$$x_1 + x_2 - l = 0$$

The other four constraints ensure that the two masses slide directly down the inclined planes in the plane shown. This is assumed implicitly by using only the variables $x_1$ and $x_2$. Let us choose $x_1$ as the primary generalized coordinate, thus

$$x_2 = l - x_1$$
$$y_1 = x_1\sin\theta_1$$
$$y_2 = (l - x_1)\sin\theta_2$$

Figure: Two connected masses on frictionless inclined planes.

The conservative gravitational force is absorbed into the potential energy given by

$$U = -m_1 g x_1\sin\theta_1 - m_2 g(l - x_1)\sin\theta_2$$

Since $\dot x_1 = -\dot x_2$ the kinetic energy is given by

$$T = \frac{1}{2}m_1\dot x_1^2 + \frac{1}{2}m_2\dot x_2^2 = \frac{1}{2}(m_1 + m_2)\dot x_1^2$$

The Lagrangian then is

$$L = \frac{1}{2}(m_1 + m_2)\dot x_1^2 + m_1 g x_1\sin\theta_1 + m_2 g(l - x_1)\sin\theta_2$$

Therefore

$$\frac{\partial L}{\partial \dot x_1} = (m_1 + m_2)\dot x_1$$
$$\frac{\partial L}{\partial x_1} = g(m_1\sin\theta_1 - m_2\sin\theta_2)$$

Thus

$$\Lambda_{x_1}L = \frac{d}{dt}\frac{\partial L}{\partial \dot x_1} - \frac{\partial L}{\partial x_1} = 0 = (m_1 + m_2)\ddot x_1 - g(m_1\sin\theta_1 - m_2\sin\theta_2)$$

Note that the system acts as though the inertial mass is $(m_1 + m_2)$ while the driving force comes from the difference of the forces. The acceleration is zero if

$$m_1\sin\theta_1 = m_2\sin\theta_2$$

A special case of this is the Atwood’s machine with a massless pulley shown in the adjacent figure. For this case $\theta_1 = \theta_2 = 90°$. Thus

$$(m_1 + m_2)\ddot x_1 = g(m_1 - m_2)$$

Figure: Atwood's machine.

Note that this problem has been solved without any reference to the force in the rope or the normal constraint forces on the inclined planes.
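
The equation of motion above reduces to a single closed-form acceleration, which can be captured in a few lines. The sketch below is a minimal illustration; the function name `acceleration` and the default $g = 9.81$ m/s² are assumptions made here, with angles given in radians.

```python
import math

def acceleration(m1, m2, theta1, theta2, g=9.81):
    """Acceleration of x1 for two rope-connected masses on frictionless inclines.

    Implements (m1 + m2) * x1_ddot = g * (m1 sin(theta1) - m2 sin(theta2)).
    """
    return g * (m1 * math.sin(theta1) - m2 * math.sin(theta2)) / (m1 + m2)
```

Setting both angles to $\pi/2$ gives the Atwood's machine result $g(m_1 - m_2)/(m_1 + m_2)$, while any pair of masses and angles satisfying $m_1\sin\theta_1 = m_2\sin\theta_2$ returns zero.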
6.6 Example: Two blocks connected by a frictionless bar


Two identical masses $m$ are connected by a massless rigid bar of length $l$, and they are constrained to move in two frictionless slides, one vertical and the other horizontal as shown in the adjacent figure. Assume that the conservative gravitational force acts along the negative $y$ axis and is incorporated into the scalar potential $U$. The generalized coordinate can be chosen to be the angle $\alpha$ corresponding to a single degree of freedom. The relative cartesian coordinates of the blocks are given by

$$x = l\cos\alpha$$
$$y = l\sin\alpha$$

Thus

$$\dot x = -l(\sin\alpha)\dot\alpha$$
$$\dot y = l(\cos\alpha)\dot\alpha$$

Figure: Two frictionless masses that are connected by a bar and are constrained to slide in vertical and horizontal channels.

This constraint, that is absorbed into the generalized coordinate, is holonomic, scleronomic, and conservative.

The kinetic energy is given by

$$T = \frac{1}{2}m\left(l^2(\sin\alpha)^2\dot\alpha^2 + l^2(\cos\alpha)^2\dot\alpha^2\right) = \frac{1}{2}ml^2\dot\alpha^2$$

The gravitational potential energy is given by

$$U = mgy = mgl\sin\alpha$$

Thus the Lagrangian is

$$L = \frac{1}{2}ml^2\dot\alpha^2 - mgl\sin\alpha$$

Using the Lagrange operator equation $\Lambda_\alpha L = 0$ gives

$$ml^2\ddot\alpha + mgl\cos\alpha = 0$$
$$\ddot\alpha + \frac{g}{l}\cos\alpha = 0$$

Multiplying by $\dot\alpha$ yields

$$\dot\alpha\ddot\alpha + \frac{g}{l}\dot\alpha\cos\alpha = 0$$

This can be integrated to give

$$\frac{1}{2}\dot\alpha^2 + \frac{g}{l}\sin\alpha = c$$

where $c$ is a constant. That is

$$\dot\alpha = \frac{d\alpha}{dt} = \sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}$$

Separation of the variables gives

$$dt = \frac{d\alpha}{\sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}}$$

Integration of this gives

$$t - t_0 = \int_{\alpha_0}^{\alpha}\frac{d\alpha}{\sqrt{2\left(c - \frac{g}{l}\sin\alpha\right)}}$$

The constants $c$ and $t_0$ are determined from the given initial conditions.
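
The integral above generally has to be evaluated numerically. As a cross-check, the following Python sketch integrates $\ddot\alpha = -\frac{g}{l}\cos\alpha$ directly with a fourth-order Runge-Kutta step and evaluates the first integral $c = \frac{1}{2}\dot\alpha^2 + \frac{g}{l}\sin\alpha$, which should stay constant along the motion. The function names, step count, and default values of $g$ and $l$ are illustrative assumptions.

```python
import math

def alpha_ddot(alpha, g=9.81, length=1.0):
    """Angular acceleration from the equation of motion alpha'' + (g/l) cos(alpha) = 0."""
    return -(g / length) * math.cos(alpha)

def first_integral(alpha, alpha_dot, g=9.81, length=1.0):
    """Constant of motion c = alpha_dot^2 / 2 + (g/l) sin(alpha)."""
    return 0.5 * alpha_dot**2 + (g / length) * math.sin(alpha)

def integrate(alpha, alpha_dot, t_end, steps=1000, g=9.81, length=1.0):
    """Advance (alpha, alpha_dot) to time t_end with classical RK4 steps."""
    dt = t_end / steps

    def f(a, w):
        return w, alpha_ddot(a, g, length)

    for _ in range(steps):
        k1a, k1w = f(alpha, alpha_dot)
        k2a, k2w = f(alpha + 0.5 * dt * k1a, alpha_dot + 0.5 * dt * k1w)
        k3a, k3w = f(alpha + 0.5 * dt * k2a, alpha_dot + 0.5 * dt * k2w)
        k4a, k4w = f(alpha + dt * k3a, alpha_dot + dt * k3w)
        alpha += dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6.0
        alpha_dot += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return alpha, alpha_dot
```

Comparing `first_integral` before and after a call to `integrate` gives a quick test of both the derivation and the integrator.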
6.7 Example: Block sliding on a movable frictionless inclined plane


Consider a block of mass $m$ free to slide on a smooth frictionless inclined plane of mass $M$ that is free to slide horizontally as shown in the adjacent figure. The six degrees of freedom can be reduced to two independent generalized coordinates since the inclined plane and mass $m$ are confined to slide along specific non-orthogonal directions. Choose $x$ as the coordinate for movement of the inclined plane in the horizontal $\hat{\mathbf{i}}$ direction and $x'$ the position of the block with respect to the surface of the inclined plane in the $\hat{\mathbf{e}}$ direction which is inclined downward at an angle $\theta$. Thus the velocity of the inclined plane is

$$\mathbf{V} = \hat{\mathbf{i}}\dot x$$

while the velocity of the small block on the inclined plane is

$$\mathbf{v} = \hat{\mathbf{i}}\dot x + \hat{\mathbf{e}}\dot x'$$

Figure: A block sliding on a frictionless movable inclined plane.

The kinetic energy is given by

$$T = \frac{1}{2}M\mathbf{V}\cdot\mathbf{V} + \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} = \frac{1}{2}M\dot x^2 + \frac{1}{2}m\left[\dot x^2 + \dot x'^2 + 2\dot x\dot x'\cos\theta\right]$$

The conservative gravitational force is absorbed into the scalar potential energy which depends only on the vertical position of the block and is taken to be zero at the top of the wedge.

$$U = -mgx'\sin\theta$$

Thus the Lagrangian is

$$L = \frac{1}{2}M\dot x^2 + \frac{1}{2}m\left[\dot x^2 + \dot x'^2 + 2\dot x\dot x'\cos\theta\right] + mgx'\sin\theta$$

Consider the Lagrange-Euler equation for the $x$ coordinate, $\Lambda_x L = 0$, which gives

$$\frac{d}{dt}\left[m(\dot x + \dot x'\cos\theta) + M\dot x\right] = 0 \qquad (a)$$

which states that $[m(\dot x + \dot x'\cos\theta) + M\dot x]$ is a constant of motion. This constant of motion is just the total linear momentum of the complete system in the $x$ direction. That is, conservation of the linear momentum is satisfied automatically by the Lagrangian approach. The Newtonian approach also predicts conservation of the linear momentum since there are no external horizontal forces.

Consider the Lagrange equation for the $x'$ coordinate, $\Lambda_{x'}L = 0$, which gives

$$\frac{d}{dt}\left[\dot x' + \dot x\cos\theta\right] = g\sin\theta \qquad (b)$$

Performing both of the time derivatives for equations (a) and (b) gives

$$m\left[\ddot x + \ddot x'\cos\theta\right] + M\ddot x = 0$$
$$\ddot x' + \ddot x\cos\theta = g\sin\theta$$

Solving for $\ddot x$ and $\ddot x'$ gives

$$\ddot x = \frac{-g\sin\theta\cos\theta}{(m + M)/m - \cos^2\theta}$$

and

$$\ddot x' = \frac{g\sin\theta}{1 - m\cos^2\theta/(m + M)}$$

This example illustrates the flexibility of being able to use non-orthogonal displacement vectors to specify the scalar Lagrangian energy. Newtonian mechanics would require more thought to solve this problem.
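
The last step, solving the two coupled equations for $\ddot x$ and $\ddot x'$, can be checked numerically. The following Python sketch solves the linear pair by Cramer's rule and evaluates the closed-form expressions quoted above, so that the two can be compared. The function names and the default $g = 9.81$ m/s² are assumptions made for this illustration.

```python
import math

def accelerations(m, M, theta, g=9.81):
    """Solve the two equations of motion for (x_ddot, xprime_ddot) by Cramer's rule.

    (m + M) x'' + m cos(theta) x''' = 0
    cos(theta) x'' +           x''' = g sin(theta)
    where x''' here denotes the second derivative of the block coordinate x'.
    """
    c, s = math.cos(theta), math.sin(theta)
    det = (m + M) * 1.0 - (m * c) * c
    x_ddot = (0.0 * 1.0 - (m * c) * (g * s)) / det
    xp_ddot = ((m + M) * (g * s) - c * 0.0) / det
    return x_ddot, xp_ddot

def closed_form(m, M, theta, g=9.81):
    """Closed-form accelerations as quoted in the example."""
    c, s = math.cos(theta), math.sin(theta)
    x_ddot = -g * s * c / ((m + M) / m - c**2)
    xp_ddot = g * s / (1.0 - m * c**2 / (m + M))
    return x_ddot, xp_ddot
```

In the limit $M \gg m$ the wedge barely moves and $\ddot x'$ approaches the fixed-incline value $g\sin\theta$, which provides a further sanity check.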
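6.8 Example: Sphere rolling without slipping down an inclined plane on a frictionless floor


A sphere of mass $m$ and radius $r$ rolls, without slipping, down an inclined plane, of mass $M$, sitting on a frictionless horizontal floor as shown in the adjacent figure. Here $x$ is the horizontal position of the inclined plane, $\theta$ is the rolling angle of the sphere, $\xi$ is the distance rolled along the incline, and $\phi$ is the angle of inclination. The velocity of the rolling sphere has horizontal and vertical components of

$$v_x = \dot x + r\dot\theta\cos\phi$$
$$v_y = -r\dot\theta\sin\phi$$

Assume initial conditions are $t = 0$, $\xi = 0$, $x = 0$, $\theta = 0$, $y = h$, $\dot x = \dot\theta = 0$. Choose the independent coordinates $x$ and $\theta$ as generalized coordinates plus the holonomic constraint $\xi = r\theta$. Then the Lagrangian is

$$L = \frac{M}{2}\dot x^2 + \frac{m}{2}\left[\dot x^2 + r^2\dot\theta^2 + 2r\dot x\dot\theta\cos\phi\right] + \frac{1}{5}mr^2\dot\theta^2 - mg(h - r\theta\sin\phi)$$

Lagrange’s equations $\Lambda_x L = 0$ and $\Lambda_\theta L = 0$ give

$$(M + m)\ddot x + mr\ddot\theta\cos\phi = 0$$
$$\ddot x\cos\phi + \frac{7}{5}r\ddot\theta - g\sin\phi = 0$$

Eliminating $\ddot x$ gives

$$\left[\frac{7}{5} - \frac{m\cos^2\phi}{M + m}\right]\ddot\theta = \frac{g\sin\phi}{r}$$

Integrating this equation, assuming the initial conditions, results in

$$\theta = \frac{5(M + m)\sin\phi}{2\left[7(M + m) - 5m\cos^2\phi\right]}\frac{gt^2}{r}$$

Thus

$$x = -\frac{mr\cos\phi}{M + m}\theta = -\frac{5m\sin(2\phi)}{4\left[7(M + m) - 5m\cos^2\phi\right]}gt^2$$

Figure: Solid sphere rolling without slipping on an inclined plane on a frictionless horizontal floor.

Note that these equations predict conservation of linear momentum for the block plus sphere.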


6.9 Example: Mass sliding on a rotating straight frictionless rod. 


Consider a mass $m$ sliding on a frictionless rod that rotates about one end of the rod with an angular velocity $\dot\theta$. Choose $r$ and $\theta$ to be generalized coordinates. Then the kinetic energy is given by

$$T = \frac{1}{2}m\dot r^2 + \frac{1}{2}mr^2\dot\theta^2$$

and potential energy

$$U = 0$$

Figure: Mass sliding on a rotating straight frictionless rod.

The Lagrange equation for $\theta$ gives

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = \frac{d}{dt}\left(mr^2\dot\theta\right) = 0$$

Thus the angular momentum is constant

$$mr^2\dot\theta = \text{constant} = p_\theta$$

The Lagrange equation for $r$ gives

$$\Lambda_r L = \frac{d}{dt}\frac{\partial L}{\partial \dot r} - \frac{\partial L}{\partial r} = m\ddot r - mr\dot\theta^2 = 0$$

The $\theta$ equation states that the angular momentum is conserved for this case, which is what we expect since there are no external torques acting on the system. The $r$ equation states that the centrifugal acceleration is $\ddot r = r\omega^2$, where $\omega = \dot\theta$. These equations of motion were derived without reference to the forces between the rod and mass.

6.10 Example: Spherical pendulum 


The spherical pendulum is a classic holonomic problem in mechanics that involves rotation plus oscillation where the pendulum is free to swing in any direction. This also applies to a particle constrained to slide in a smooth frictionless spherical bowl under gravity, such as a bar of soap in a wet hemispherical sink. Consider the equation of motion of the spherical pendulum of mass $m$ and length $b$ shown in the adjacent figure. The most convenient generalized coordinates are $r, \theta, \phi$ with origin at the fulcrum, since the length is constrained to be $r = b$. The kinetic energy is

$$T = \frac{1}{2}mb^2\dot\theta^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot\phi^2$$

The potential energy is

$$U = -mgb\cos\theta$$

giving that

$$L = \frac{1}{2}mb^2\dot\theta^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot\phi^2 + mgb\cos\theta$$

Figure: Spherical pendulum.

The Lagrange equation for $\theta$

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = 0$$

which gives

$$mb^2\ddot\theta = mb^2\dot\phi^2\sin\theta\cos\theta - mgb\sin\theta$$

The Lagrange equation for $\phi$

$$\Lambda_\phi L = \frac{d}{dt}\frac{\partial L}{\partial \dot\phi} - \frac{\partial L}{\partial \phi} = \frac{d}{dt}\left[mb^2\sin^2\theta\,\dot\phi\right] = 0$$

which gives

$$mb^2\sin^2\theta\,\dot\phi = p_\phi = \text{constant}$$

This is just the angular momentum $p_\phi$ for the pendulum rotating in the $\phi$ direction. Automatically the Lagrange approach shows that the angular momentum $p_\phi$ is a conserved quantity. This is what is expected from Newton’s Laws of Motion since there are no external torques applied about this vertical axis.

The equation of motion for $\theta$ can be simplified to

$$\ddot\theta + \frac{g}{b}\sin\theta - \frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta} = 0$$

There are many possible solutions depending on the initial conditions. The pendulum can just oscillate in the $\theta$ direction, or rotate in the $\phi$ direction, or some combination of these. Note that if $p_\phi$ is zero, then the equation reduces to the simple harmonic pendulum, while the other extreme is when $\ddot\theta = 0$ for which the motion is that of a conical pendulum that rotates at a constant angle $\theta_0$ to the vertical axis.
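
The two limiting cases can be checked numerically from the simplified $\theta$ equation. The Python sketch below evaluates $\ddot\theta$ for a given $p_\phi$ and also computes the value of $p_\phi$ for a conical pendulum. The conical-pendulum rate $\dot\phi^2 = g/(b\cos\theta_0)$ used here is not quoted in the text; it is obtained by setting $\ddot\theta = 0$ in the equation of motion, and the function names and default $g = 9.81$ m/s² are likewise assumptions of this illustration.

```python
import math

def theta_ddot(theta, p_phi, m, b, g=9.81):
    """Polar-angle acceleration from the simplified theta equation of motion."""
    centrifugal = p_phi**2 * math.cos(theta) / (m**2 * b**4 * math.sin(theta)**3)
    return centrifugal - (g / b) * math.sin(theta)

def conical_p_phi(theta0, m, b, g=9.81):
    """Angular momentum for steady rotation at constant polar angle theta0.

    Assumes 0 < theta0 < pi/2 and uses phi_dot^2 = g / (b cos(theta0)),
    which follows from setting theta_ddot = 0.
    """
    phi_dot = math.sqrt(g / (b * math.cos(theta0)))
    return m * b**2 * math.sin(theta0)**2 * phi_dot
```

With $p_\phi$ from `conical_p_phi`, `theta_ddot` evaluated at $\theta_0$ should vanish to rounding error, while with $p_\phi = 0$ it reduces to the plane-pendulum value $-\frac{g}{b}\sin\theta$.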




6.11 Example: Spring plane pendulum 


A mass $m$ is suspended by a spring with spring constant $k$ in the gravitational field. Besides the longitudinal spring vibration, the spring performs a plane pendulum motion in the vertical plane, as illustrated in the adjacent figure. Find the Lagrangian, the equations of motion, and force in the spring.

The system is holonomic, conservative, and scleronomic. Introduce plane polar coordinates with radial length $r$ and polar angle $\theta$ as generalized coordinates. The generalized coordinates are related to the cartesian coordinates by

$$y = r\cos\theta$$
$$x = r\sin\theta$$

Therefore the velocities are given by

$$\dot y = \dot r\cos\theta - r\dot\theta\sin\theta$$
$$\dot x = \dot r\sin\theta + r\dot\theta\cos\theta$$

The kinetic energy is given by

$$T = \frac{1}{2}m\left(\dot x^2 + \dot y^2\right) = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right)$$

The gravitational plus spring potential energies both can be absorbed into the potential $U$.

$$U = -mgr\cos\theta + \frac{k}{2}(r - r_0)^2$$

Figure: Spring pendulum having spring constant $k$ and oscillating in a vertical plane.

where $r_0$ denotes the rest length of the spring. The Lagrangian thus equals

$$L = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right) + mgr\cos\theta - \frac{k}{2}(r - r_0)^2$$

For the polar angle $\theta$, the Lagrange equation $\Lambda_\theta L = 0$ gives

$$\frac{d}{dt}\left(mr^2\dot\theta\right) = -mgr\sin\theta$$

The angular momentum is $p_\theta = mr^2\dot\theta$, thus the equation of motion can be written as

$$\dot p_\theta = -mgr\sin\theta$$

Alternatively, evaluating $\frac{d}{dt}\left(mr^2\dot\theta\right)$ gives

$$mr^2\ddot\theta = -mgr\sin\theta - 2mr\dot r\dot\theta$$

The last term on the right-hand side is the Coriolis force caused by the time variation of the pendulum length.
For the radial distance $r$, the Lagrange equation $\Lambda_r L = 0$ gives

$$m\ddot r = mr\dot\theta^2 + mg\cos\theta - k(r - r_0)$$

This equation just equals the tension in the spring, i.e. $F = m\ddot r$. The first term on the right-hand side represents the centrifugal radial acceleration, the second term is the component of the gravitational force, and the third term represents Hooke’s Law for the spring. For small amplitudes of $\theta$ the motion appears as a superposition of harmonic oscillations in the $r, \theta$ plane.

In this example the orthogonal coordinate approach used gave the tension in the spring, thus it is unnecessary to repeat this using the Lagrange multiplier approach.
6.12 Example: The yo-yo


Consider a yo-yo comprising a disc that has a string wrapped around it with one end attached to a fixed support. The disc is allowed to fall with the string unwinding as it falls as illustrated in the adjacent figure. Derive the equations of motion and the forces of constraint via use of Lagrange multipliers. Use $y$ and $\phi$ as independent generalized coordinates.

The kinetic energy of the falling yo-yo is given by

$$T = \frac{1}{2}m\dot y^2 + \frac{1}{2}I\dot\phi^2 = \frac{1}{2}m\dot y^2 + \frac{1}{4}ma^2\dot\phi^2$$

where $m$ is the mass of the disc, $a$ the radius, and $I = \frac{1}{2}ma^2$ is the moment of inertia of the disc about its central axis. The potential energy of the disc is

$$U = -mgy$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot y^2 + \frac{1}{4}ma^2\dot\phi^2 + mgy$$

Figure: The yo-yo comprises a falling disc unrolling from a string attached to the disc at one end and a fixed support at the other end.

The one equation of constraint is holonomic

$$g(y, \phi) = y - a\phi = 0$$

The two Lagrange equations are

$$\frac{\partial L}{\partial y} - \frac{d}{dt}\frac{\partial L}{\partial \dot y} + \lambda\frac{\partial g}{\partial y} = 0$$
$$\frac{\partial L}{\partial \phi} - \frac{d}{dt}\frac{\partial L}{\partial \dot\phi} + \lambda\frac{\partial g}{\partial \phi} = 0$$

with only one Lagrange multiplier $\lambda$. Evaluating these two Euler-Lagrange equations leads to two equations of motion

$$mg - m\ddot y + \lambda = 0$$
$$-\frac{1}{2}ma^2\ddot\phi - \lambda a = 0$$

Differentiating the equation of constraint gives

$$\ddot\phi = \frac{\ddot y}{a}$$

Inserting this into the second equation and solving the two equations gives

$$\lambda = -\frac{1}{3}mg$$

Inserting $\lambda$ into the two equations of motion gives

$$\ddot y = \frac{2}{3}g$$
$$\ddot\phi = \frac{2}{3}\frac{g}{a}$$

The generalized force of constraint is

$$F_y = \lambda\frac{\partial g}{\partial y} = -\frac{1}{3}mg$$

and the constraint torque is

$$N_\phi = \lambda\frac{\partial g}{\partial \phi} = \frac{1}{3}mga$$

Thus the string reduces the acceleration of the disc in the gravitational field by one third, to $\ddot y = \frac{2}{3}g$.
6.13 Example: Mass constrained to move on the inside of a frictionless paraboloid

A mass $m$ moves on the frictionless inner surface of a paraboloid

$$x^2 + y^2 = \rho^2 = az$$

with a gravitational potential energy of $U = mgz$.

This system is holonomic, scleronomic, and conservative. Choose cylindrical coordinates $\rho, \phi, z$ with respect to the vertical axis of the paraboloid to be the generalized coordinates.

The Lagrangian is

$$L = \frac{1}{2}m\left(\dot\rho^2 + \rho^2\dot\phi^2 + \dot z^2\right) - mgz$$

The equation of constraint is

$$g(\rho, z) = \rho^2 - az = 0$$

The Lagrange multiplier approach will be used to determine the forces of constraint.

Figure: Mass constrained to slide on the inside of a frictionless paraboloid.

For $\Lambda_\rho L = \lambda_1\frac{\partial g}{\partial \rho}$

$$\frac{d}{dt}\frac{\partial L}{\partial \dot\rho} - \frac{\partial L}{\partial \rho} = \lambda_1 2\rho$$
$$m\left(\ddot\rho - \rho\dot\phi^2\right) = \lambda_1 2\rho \qquad (a)$$

For $\Lambda_\phi L = \lambda_1\frac{\partial g}{\partial \phi}$

$$\frac{d}{dt}\left(m\rho^2\dot\phi\right) = \dot p_\phi = 0 \qquad (b)$$

Thus the angular momentum $p_\phi$ is conserved, that is, it is a constant of motion.
For $\Lambda_z L = \lambda_1\frac{\partial g}{\partial z}$

$$m\ddot z = -mg - \lambda_1 a \qquad (c)$$

and the time differential of the constraint equation is

$$2\rho\dot\rho - a\dot z = 0 \qquad (d)$$

The above four equations of motion can be used to determine $\rho, \phi, z, \lambda_1$.
The radius of the circle at the intersection of the plane $z = h$ with the paraboloid $\rho^2 = az$ is given by $\rho_0 = \sqrt{ah}$. For a constant height $z = h$, then $\ddot z = 0$ and equation (c) reduces to

$$\lambda_1 = -\frac{mg}{a}$$

Assuming that $\ddot\rho = 0$, then equation (a) for $\dot\phi = \omega$ and $\rho = \rho_0$ gives

$$m\left(0 - \rho_0\omega^2\right) = \lambda_1 2\rho_0 = -\frac{2mg}{a}\rho_0 = F_c$$

That is, the constraint force equals

$$F_c = -m\rho_0\omega^2$$

which is the usual centripetal force. These relations also give that the initial angular velocity required for such a stable trajectory with height $h$ is

$$\dot\phi = \omega = \sqrt{\frac{2g}{a}}$$
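
The circular-orbit relations above are simple enough to verify with direct arithmetic. The following Python sketch, with function name and default values chosen here purely for illustration, computes $\rho_0$, $\lambda_1$ and $\omega$ for a given paraboloid parameter $a$ and height $h$, and returns the residual of equation (a) so that the consistency of the three relations can be checked.

```python
import math

def circular_orbit(a, h, m=1.0, g=9.81):
    """Return (rho0, lambda_1, omega, residual) for a horizontal circular orbit.

    rho0     : orbit radius sqrt(a h) on the paraboloid rho^2 = a z
    lambda_1 : multiplier from equation (c) with z'' = 0
    omega    : angular velocity sqrt(2 g / a)
    residual : left side minus right side of equation (a) with rho'' = 0
    """
    rho0 = math.sqrt(a * h)
    lam = -m * g / a
    omega = math.sqrt(2.0 * g / a)
    residual = m * (0.0 - rho0 * omega**2) - 2.0 * lam * rho0
    return rho0, lam, omega, residual
```

Note that $\omega$ does not depend on the height $h$, so in this idealized model every horizontal circular orbit on the paraboloid has the same angular velocity.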
6.14 Example: Mass on a frictionless plane connected to a plane pendulum


Two masses $m_1$ and $m_2$ are connected by a string of length $l$. Mass $m_1$ is on a horizontal frictionless table and it is assumed that mass $m_2$ moves in a vertical plane. This is another problem involving holonomic constrained motion. The constraints are:

1) $m_1$ moves in the horizontal plane

2) $m_2$ moves in the vertical plane

3) $r + s = l$. Therefore $\dot r = -\dot s$

There are 6-3 = 3 remaining degrees of freedom after taking the constraints into account. Choose as a set of generalized coordinates $r$, $\theta$, and $\phi$. In terms of these three generalized coordinates, the kinetic energy is

$$T = \frac{1}{2}m_1\left(\dot s^2 + s^2\dot\phi^2\right) + \frac{1}{2}m_2\left(\dot r^2 + r^2\dot\theta^2\right) = \frac{1}{2}m_1\left(\dot r^2 + (l - r)^2\dot\phi^2\right) + \frac{1}{2}m_2\left(\dot r^2 + r^2\dot\theta^2\right)$$

Figure: Mass $m_2$, hanging from a rope that is connected to $m_1$, which slides on a frictionless plane.

The potential energy in terms of the generalized coordinates relative to the horizontal plane, is

$$U = 0 - m_2 gr\cos\theta$$

Therefore the Lagrangian equals

$$L = \frac{1}{2}m_1\left(\dot r^2 + (l - r)^2\dot\phi^2\right) + \frac{1}{2}m_2\left(\dot r^2 + r^2\dot\theta^2\right) + m_2 gr\cos\theta$$

The differentials are

$$\frac{\partial L}{\partial r} = -m_1(l - r)\dot\phi^2 + m_2 r\dot\theta^2 + m_2 g\cos\theta$$
$$\frac{\partial L}{\partial \dot r} = (m_1 + m_2)\dot r$$
$$\frac{\partial L}{\partial \theta} = -m_2 gr\sin\theta$$
$$\frac{\partial L}{\partial \dot\theta} = m_2 r^2\dot\theta$$
$$\frac{\partial L}{\partial \phi} = 0$$
$$\frac{\partial L}{\partial \dot\phi} = m_1(l - r)^2\dot\phi$$

Thus the three Lagrange equations are

$$\Lambda_r L = (m_1 + m_2)\ddot r + m_1(l - r)\dot\phi^2 - m_2 r\dot\theta^2 - m_2 g\cos\theta = 0$$

$$\Lambda_\theta L = \frac{d}{dt}\left[m_2 r^2\dot\theta\right] + m_2 gr\sin\theta = 0$$

that is

$$2m_2 r\dot r\dot\theta + r^2 m_2\ddot\theta + m_2 gr\sin\theta = 0$$

$$\Lambda_\phi L = \frac{d}{dt}\left[m_1(l - r)^2\dot\phi\right] = 0$$

This last equation is a statement of the conservation of angular momentum. These three differential equations of motion can be solved for known initial conditions.
6.15 Example: Two connected masses constrained to slide along a moving rod


Consider two identical masses $m$, constrained to move along the axis of a thin straight rod, of mass $M$ and length $l$, which is free to both translate and rotate. Two identical springs link the two masses to the central point of the rod. Consider only motions of the system for which the extended lengths of the two springs are equal and opposite, such that the two masses always are equal distances from the center of the rod, keeping the center of mass at the center of the rod. Find the equations of motion for this system.

Use a fixed cartesian coordinate system $(x, y, z)$ and a moving frame with the origin $O$ at the center of the rod with its cartesian coordinates $(x_1, y_1, z_1)$ being parallel to the fixed coordinate frame as shown in the figure. Let $(r, \theta, \phi)$ be the spherical coordinates of a point referring to the center of the moving $(x_1, y_1, z_1)$ frame as shown in the figure. Then the two masses $m$ have spherical coordinates $(r, \theta, \phi)$ and $(-r, \theta, \phi)$ in the moving-rod fixed frame. The frictionless constraints are holonomic.

[Figure: Two identical masses $m$ constrained to slide on a moving rod of mass $M$. The masses are attached to the center of the rod by identical springs each having a spring constant $K$.]

The kinetic energy of the system is equal to the kinetic energy for all the mass concentrated at the center of mass plus the kinetic energy about the center of mass. Since $O$ is the center of mass, the kinetic energy can be separated into three terms

\[ T = T_{cm} + T_{rot}^{masses} + T_{rot}^{rod} \]

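Note that since the kinetic energy is a scalar quantity it is rotationally invariant and thus can be evaluated in any rotated frame. Thus the kinetic energy of the center of mass is

\[ T_{cm} = \frac{1}{2} (M + 2m) \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) \]

The rotational kinetic energy of the two masses in the center of mass frame is

\[ T_{rot}^{masses} = m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \dot{\phi}^2 \sin^2\theta \right) \]

The rotational kinetic energy of the rod $T_{rot}^{rod}$ is a scalar and thus can be evaluated in any rotated frame of reference fixed with respect to the principal axis system of the rod. The angular velocity of the rod about $O$ resolved along its principal axes is given by

\[ \boldsymbol{\omega} = \dot{\phi} \cos\theta \, \hat{e}_r - \dot{\phi} \sin\theta \, \hat{e}_\theta + \dot{\theta} \, \hat{e}_\phi \]

The corresponding moments of inertia of the uniform infinitesimally-thin rod are $I_r = 0$, $I_\theta = \frac{1}{12} M l^2$, $I_\phi = \frac{1}{12} M l^2$. Hence the rotational kinetic energy of the rod is

\[ T_{rot}^{rod} = \frac{1}{2} \left( I_r \omega_r^2 + I_\theta \omega_\theta^2 + I_\phi \omega_\phi^2 \right) = \frac{1}{24} M l^2 \left( \dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta \right) \]

The only potential energy is due to the two extended springs, which are assumed to have the same length $r$, where $r_0$ is the unstretched length. Each spring contributes $\frac{1}{2} K (r - r_0)^2$, giving a total potential energy $U = K (r - r_0)^2$.

Thus the Lagrangian is

\[ L = \frac{1}{2} (M + 2m) \left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) + m \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \dot{\phi}^2 \sin^2\theta \right) + \frac{1}{24} M l^2 \left( \dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta \right) - K (r - r_0)^2 \]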

Using Lagrange's equations $\Lambda_{q_j} L = 0$ for the generalized coordinates gives

\begin{align*}
(M + 2m) \dot{x} &= \text{constant} & (\Lambda_x L = 0) \\
(M + 2m) \dot{y} &= \text{constant} & (\Lambda_y L = 0) \\
(M + 2m) \dot{z} &= \text{constant} & (\Lambda_z L = 0) \\
\left( 2 m r^2 + \frac{1}{12} M l^2 \right) \dot{\phi} \sin^2\theta &= \text{constant} & (\Lambda_\phi L = 0) \\
\ddot{r} - r \dot{\theta}^2 - r \dot{\phi}^2 \sin^2\theta + \frac{K}{m} (r - r_0) &= 0 & (\Lambda_r L = 0) \\
\left( r^2 + \frac{M l^2}{24 m} \right) \ddot{\theta} + 2 r \dot{r} \dot{\theta} - \left( r^2 + \frac{M l^2}{24 m} \right) \dot{\phi}^2 \sin\theta \cos\theta &= 0 & (\Lambda_\theta L = 0)
\end{align*}

The first three equations show that the three components of the linear momentum of the center of mass are constants of motion. The fourth equation shows that the component of the angular momentum about the $z_1$ axis is a constant of motion. Since the $z_1$ axis has been arbitrarily chosen, the total angular momentum must be conserved. The fifth and sixth equations give the radial and angular equations of motion of the oscillating masses $m$.
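As a worked illustration of how these results follow from the Lagrangian of this example, the $\phi$ and $r$ equations can be obtained in two short steps. This is a sketch that assumes the Lagrangian written in terms of $K$, $r_0$, $M$, $l$ and $m$ as defined in this example:

```latex
\begin{align*}
p_\phi &= \frac{\partial L}{\partial \dot{\phi}}
        = 2 m r^2 \dot{\phi} \sin^2\theta + \frac{1}{12} M l^2 \dot{\phi} \sin^2\theta ,
  \qquad \frac{\partial L}{\partial \phi} = 0
  \;\Rightarrow\; \frac{d p_\phi}{dt} = 0 \\
\frac{d}{dt} \frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r}
       &= 2 m \ddot{r} - 2 m r \dot{\theta}^2 - 2 m r \dot{\phi}^2 \sin^2\theta + 2 K (r - r_0) = 0
\end{align*}
```

Dividing the second line by $2m$ gives the radial equation of motion, and the first line is the conserved angular momentum about the $z_1$ axis.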


6.9 Applications involving non-holonomic constraints 


In general, non-holonomic constraints can be handled by use of generalized forces $Q_j^{EXC}$ in the Lagrange-Euler equations 6.60. The following examples, 6.16 to 6.19, involve one-sided constraints which exhibit holonomic behavior for restricted ranges of the constraint surface in coordinate space, and this range is case specific. When the forces of constraint press the object against the constraint surface, then the system is holonomic, but the holonomic range of coordinate space is limited to situations where the constraint forces are positive. When the constraint force is negative, the object flies free from the constraint surface. In addition, when the frictional force $F > N \mu_{static}$, where $\mu_{static}$ is the static coefficient of friction, then the object slides, negating any rolling constraint that assumes static friction.


6.16 Example: Mass sliding on a frictionless spherical shell 


Consider a mass that starts from rest at the top of a frictionless fixed spherical shell of radius $R$. The questions are: what is the force of constraint, and at what angle $\theta$ does the mass leave the surface of the spherical shell? The coordinates $r, \theta$ shown are the obvious generalized coordinates to use. The constraint will not apply if the force of constraint does not hold the mass against the surface of the spherical shell; that is, the system is holonomic only in a restricted domain.

The Lagrangian is

\[ L = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) - m g r \cos\theta \]

[Figure: Mass $m$ sliding on frictionless cylinder of radius $R$.]

This Lagrangian is applicable irrespective of whether the constraint is obeyed, where the constraint is given by

\[ g(r, \theta) = r - R = 0 \]

For the restricted domain where this system is holonomic, it can be solved using generalized coordinates, generalized forces, Lagrange multipliers, or Newtonian mechanics as illustrated below.

Minimal generalized coordinates:

The minimal number of generalized coordinates reduces the system to one coordinate $\theta$, which does not determine the constraint force that is needed to know if the constraint applies. Thus this approach is not useful for solving this partially-holonomic system.
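Generalized forces:
The radial constraint has a corresponding generalized force $Q_r$. The Lagrange equation $\Lambda_r L = Q_r$ gives

\[ m \ddot{r} + m g \cos\theta - m r \dot{\theta}^2 = Q_r \qquad (a) \]

The Lagrange equation $\Lambda_\theta L = Q_\theta = 0$ since there is no tangential force for this frictionless system. Therefore

\[ m r^2 \ddot{\theta} - m g r \sin\theta + 2 m r \dot{r} \dot{\theta} = 0 \qquad (b) \]

When constrained to follow the surface of the spherical shell, the system is holonomic, i.e. $r = R$ and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to

\[ m g \cos\theta - m R \dot{\theta}^2 = Q_r \qquad (c) \]
\[ m R^2 \ddot{\theta} - m g R \sin\theta = 0 \]

That is

\[ \ddot{\theta} = \frac{g}{R} \sin\theta \]

Integrate to get $\dot{\theta}$ using the fact that

\[ \ddot{\theta} = \frac{d\dot{\theta}}{dt} = \frac{d\dot{\theta}}{d\theta} \frac{d\theta}{dt} = \dot{\theta} \frac{d\dot{\theta}}{d\theta} \]

then

\[ \int \ddot{\theta} \, d\theta = \int \dot{\theta} \, d\dot{\theta} = \frac{g}{R} \int \sin\theta \, d\theta \]

Therefore

\[ \dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \qquad (d) \]

assuming that $\dot{\theta} = 0$ at $\theta = 0$. Substituting equation (d) into equation (c) gives the constraint force, which is normal to the surface, to be

\[ F = Q_r = m g (3 \cos\theta - 2) \]

Note that $F = Q_r = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2^\circ$.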
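Lagrange multipliers:
For the holonomic regime, which obeys the constraint $g(r, \theta) = r - R = 0$, the Lagrange equation for $r$ is $\Lambda_r L = \lambda \frac{\partial g}{\partial r}$. Since $\frac{\partial g}{\partial r} = 1$, then

\[ m \ddot{r} + m g \cos\theta - m r \dot{\theta}^2 = \lambda \qquad (a) \]

The Lagrange equation for $\theta$ gives $\Lambda_\theta L = \lambda \frac{\partial g}{\partial \theta} = 0$ since $\frac{\partial g}{\partial \theta} = 0$. Thus

\[ m r^2 \ddot{\theta} - m g r \sin\theta + 2 m r \dot{r} \dot{\theta} = 0 \qquad (b) \]

As above, when constrained to follow the surface of the spherical shell, the system is holonomic with $r = R$ and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to

\[ m g \cos\theta - m R \dot{\theta}^2 = \lambda \qquad (c) \]
\[ m R^2 \ddot{\theta} - m g R \sin\theta = 0 \]

That is, the answers are identical to those obtained using generalized forces, namely

\[ \dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \qquad (d) \]

assuming that $\dot{\theta} = 0$ at $\theta = 0$.
The force of constraint applied by the surface is

\[ F = \lambda \frac{\partial g}{\partial r} = \lambda \]

Substituting equation (d) into equation (c) gives

\[ F = \lambda = m g (3 \cos\theta - 2) \]

Note that $\lambda = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2^\circ$.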

Both of the above methods give identical results and give that the force of constraint is negative when $\theta > 48.2^\circ$. Assuming that the surface cannot hold the mass against the surface, the mass will fly off the spherical shell when $\theta > 48.2^\circ$ and the system reduces to an unconstrained object falling freely in a uniform gravitational field, which is holonomic, that is $Q_r = \lambda = 0$. Then the equations of motion (a) and (b) reduce to

\[ m \ddot{r} + m g \cos\theta - m r \dot{\theta}^2 = 0 \qquad (e) \]
\[ m r^2 \ddot{\theta} - m g r \sin\theta + 2 m r \dot{r} \dot{\theta} = 0 \qquad (f) \]


Energy conservation:
This problem can be solved using energy conservation

\[ \frac{1}{2} m v^2 = m g R [1 - \cos\theta] \]

Thus the centripetal acceleration is

\[ \frac{v^2}{R} = 2 g [1 - \cos\theta] \]

The normal force to the surface will vanish when the centripetal acceleration equals the radial component of the gravitational acceleration, that is, when

\[ \frac{v^2}{R} = 2 g [1 - \cos\theta] = g \cos\theta \]

This occurs when $\cos\theta = \frac{2}{3}$. This is an unusual case where the Newtonian approach is the simplest.
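The departure condition can be checked numerically. The short Python sketch below is an illustration only (the function names and the default values of $m$ and $g$ are assumptions): it evaluates the normal force $m g (3 \cos\theta - 2)$ found in this example and locates its zero by bisection.

```python
import math

def normal_force(theta, m=1.0, g=9.81):
    """Constraint force m g (3 cos(theta) - 2) on the sliding mass."""
    return m * g * (3.0 * math.cos(theta) - 2.0)

def departure_angle(tol=1e-12):
    """Angle (radians) at which the normal force first vanishes, by bisection."""
    lo, hi = 0.0, math.pi / 2  # force is positive at the top, negative at 90 degrees
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_force(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(math.degrees(departure_angle()))
```

The result is independent of $m$, $g$ and $R$, since the zero of the force depends only on $\cos\theta = 2/3$.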


6.17 Example: Rolling solid sphere on a spherical shell 


This is a similar problem to the prior one with the added complication of rolling, which is assumed to take place in a vertical plane, making it holonomic. Here we would like to determine the forces of constraint to see when the solid sphere flies off the spherical shell and when the friction is insufficient to stop the rolling sphere from slipping.

The best generalized coordinates are the distance $r$ of the center of the sphere from the center of the spherical shell, $\theta$, and $\phi$. It is important to note that $\phi$ is measured with respect to the vertical, not the time-dependent vector $\mathbf{r}$. That is, the direction of the radius $r$ is $\theta$, which is time dependent and thus is not a useful reference to use to define the angle $\phi$. Let us assume that the sphere is uniform with a moment of inertia of $I = \frac{2}{5} m a^2$. If the tangential frictional force $F$ is less than the limiting value $N \mu_{static}$, with $N > 0$, then the sphere will roll without slipping on the surface of the cylinder and both constraints apply. Under these conditions the system is holonomic and can be solved using Lagrange multipliers with the following equations of constraint:

[Figure: Disk of mass $m$, radius $a$, rolling on a cylindrical surface of radius $R$.]

1) The center of the sphere follows the surface of the cylinder

\[ g_1 = r - R - a = 0 \]

2) The sphere rolls without slipping

\[ g_2 = a (\phi - \theta) - R \theta = 0 \]
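The kinetic energy is $T = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) + \frac{1}{2} I \dot{\phi}^2$ and the potential energy is $U = m g r \cos\theta$. Thus the Lagrangian is

\[ L = \frac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) + \frac{1}{2} I \dot{\phi}^2 - m g r \cos\theta \]

Consider the solution using Lagrange multipliers for the holonomic regime where both constraints are satisfied and lead to the following differential constraint relations

\[ \frac{\partial g_1}{\partial r} = 1 \qquad \frac{\partial g_1}{\partial \theta} = 0 \qquad \frac{\partial g_1}{\partial \phi} = 0 \]
\[ \frac{\partial g_2}{\partial r} = 0 \qquad \frac{\partial g_2}{\partial \theta} = -(R + a) \qquad \frac{\partial g_2}{\partial \phi} = a \]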

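The Lagrange operator equation $\Lambda_r L$ gives

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r} = \lambda_1 \frac{\partial g_1}{\partial r} + \lambda_2 \frac{\partial g_2}{\partial r} \]

that is

\[ m \ddot{r} + m g \cos\theta - m r \dot{\theta}^2 = \lambda_1 \qquad (a) \]

$\Lambda_\theta L$ gives

\[ m r^2 \ddot{\theta} + 2 m r \dot{r} \dot{\theta} - m g r \sin\theta = -\lambda_2 (R + a) \qquad (b) \]

$\Lambda_\phi L$ gives

\[ I \ddot{\phi} = a \lambda_2 \qquad (c) \]

Since the center of the sphere rolling on the spherical shell must have

\[ r = R + a \]

then

\[ \dot{r} = \ddot{r} = 0 \]
\[ \ddot{\phi} = \frac{r}{a} \ddot{\theta} \]

Substituting this into (c) gives

\[ \ddot{\theta} = \frac{a^2}{I r} \lambda_2 \]

Inserting this into equation (b) gives

\[ \lambda_2 = \frac{m g \sin\theta}{\left( 1 + \frac{m a^2}{I} \right)} \]

The moment of inertia about the axis of a solid sphere is $I = \frac{2}{5} m a^2$. Then

\[ \lambda_2 = \frac{2 m g \sin\theta}{7} \]

But also

\[ \ddot{\theta} = \dot{\theta} \frac{d\dot{\theta}}{d\theta} = \frac{a^2}{I r} \lambda_2 = \frac{5}{2 m r} \lambda_2 = \frac{5 g \sin\theta}{7 r} \]

Integrating gives

\[ \int \dot{\theta} \, d\dot{\theta} = \frac{5 g}{7 r} \int \sin\theta \, d\theta \]

That is

\[ \dot{\theta}^2 = \frac{10 g}{7 r} (1 - \cos\theta) \]

assuming that $\dot{\theta} = 0$ at $\theta = 0$. Inserting this into equation (a) gives

\[ -\frac{10}{7} m g [1 - \cos\theta] + m g \cos\theta = \lambda_1 \]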
That is

\[ \lambda_1 = \frac{m g}{7} [17 \cos\theta - 10] \]

Note that this equals zero when

\[ \cos\theta = \frac{10}{17} \]

For larger angles $\lambda_1$ is negative, implying that the solid sphere will fly off the surface of the spherical shell.

The sphere will leave the surface of the cylinder when $\cos\theta = \frac{10}{17}$, that is, $\theta = 53.97^\circ$. This is a significantly larger angle than obtained for the similar problem where the mass is sliding on a frictionless cylinder because the energy stored in rotation implies that the linear velocity of the mass is lower at a given angle $\theta$ for the case of a rolling sphere.

The above discussion has omitted an important fact: if $\mu_{static} < \infty$, the frictional force becomes insufficient to maintain the rolling constraint before $\theta = 53.97^\circ$, that is, the required frictional force will exceed the sliding limit $N \mu_{static}$. To determine when the rolling constraint fails it is necessary to determine the frictional torque

\[ F_f R = -\lambda_2 R \]

Thus

\[ F_f = -\lambda_2 \]

It is in the negative direction because of the direction chosen for $\phi$. The required coefficient of friction $\mu$ is given by the ratio of the frictional force to the normal force, that is

\[ \mu = \frac{\lambda_2}{\lambda_1} = \frac{2 \sin\theta}{[17 \cos\theta - 10]} \]

For $\mu = 1$ the sphere starts to slip when $\theta = 47.54^\circ$. Note that the sphere starts slipping before it flies off the cylinder since a normal force is required to support a frictional force, and the difference depends on the coefficient of friction. The no-slipping constraint is not satisfied once the sphere starts slipping and the frictional force should equal $\mu_{kinetic} \lambda_1$. Thus for angles beyond $47.54^\circ$ the problem needs to be solved with the rolling constraint changed to a sliding non-conservative frictional force. This is best handled by including the frictional force and normal forces as generalized forces. Fortunately this will be a small correction. The friction will slightly change the exact angle at which the normal force becomes zero and the system transitions to free motion of the sphere in a gravitational field.
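The two critical angles quoted in this example can be reproduced with a few lines of Python. This is a sketch under the assumptions of the example (uniform solid sphere, rolling from rest at the top); the function names and default values of $m$ and $g$ are illustrative and not from the original text.

```python
import math

def normal_multiplier(theta, m=1.0, g=9.81):
    """lambda_1 = (m g / 7)(17 cos(theta) - 10), the normal force."""
    return m * g * (17.0 * math.cos(theta) - 10.0) / 7.0

def friction_multiplier(theta, m=1.0, g=9.81):
    """lambda_2 = 2 m g sin(theta) / 7, the magnitude of the friction force."""
    return 2.0 * m * g * math.sin(theta) / 7.0

def fly_off_angle():
    """Angle (radians) at which the normal force vanishes."""
    return math.acos(10.0 / 17.0)

def slip_angle(mu_static, tol=1e-12):
    """Angle (radians) at which the required friction first exceeds mu_static * N."""
    lo, hi = 0.0, fly_off_angle()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if friction_multiplier(mid) < mu_static * normal_multiplier(mid):
            lo = mid   # friction still sufficient: slipping starts later
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(math.degrees(fly_off_angle()))   # angle where the sphere leaves the surface
print(math.degrees(slip_angle(1.0)))   # angle where slipping starts for mu = 1
```

Calling `slip_angle` with a smaller coefficient of friction gives an earlier onset of slipping, consistent with the remark that the difference between the two angles depends on the coefficient of friction.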


6.18 Example: Solid sphere rolling plus slipping on a spherical shell 


Consider the above case when the frictional force is insufficient to constrain the motion to rolling. Now the frictional force $F$ is given by

\[ F = N \mu_{sliding} \]

when $N$ is positive.
This can be solved using generalized forces with the previous Lagrangian. Then

\[ \Lambda_r L = Q_r = N \]

which gives

\[ m \ddot{r} + m g \cos\theta - m r \dot{\theta}^2 = N \]

Similarly $\Lambda_\theta L = Q_\theta = -F (R + a)$ gives

\[ m r^2 \ddot{\theta} + 2 m r \dot{r} \dot{\theta} - m g r \sin\theta = -F (R + a) \]

Similarly $\Lambda_\phi L = Q_\phi = a F$ gives

\[ I \ddot{\phi} = a F \]

These can be solved by substituting the relation $F = N \mu_{sliding}$. The sphere flies off the spherical shell when $N < 0$, leading to free motion discussed in example 6.2. The problem of a solid uniform sphere rolling inside a hollow sphere can be solved the same way.




6.19 Example: Small body held by friction on the periphery of a rolling wheel 


Assume that a small body of mass $m$ is balanced on a rolling wheel of mass $M$ and radius $R$ as shown in the figure. The wheel rolls in a vertical plane without slipping on a horizontal surface. This example illustrates that it is possible to use simultaneously a mixture of holonomic constraints, partially-holonomic constraints, and generalized forces.³

Assume that at $t = 0$ the wheel touches the floor at $x = y = 0$ with the mass perched at the top of the wheel at $x = 0$. Let the frictional force acting on the mass $m$ be $F$ and the reaction force of the periphery of the wheel on the mass be $N$. Let $\dot{\phi}$ be the angular velocity of the wheel, and $\dot{x}$ the horizontal velocity of the center of the wheel. The polar coordinates $r, \theta$ of the mass $m$ are taken with $r$ measured from the center of the wheel and $\theta$ measured with respect to the vertical. Thus the cartesian coordinates of the small mass $m$ are $(x + r \sin\theta, R + r \cos\theta)$ with respect to the origin at $x = y = 0$.

[Figure: Small body of mass $m$ held by friction on the periphery of a rolling wheel of mass $M$ and radius $R$.]


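The kinetic energy is given by

\[ T = \frac{1}{2} M \dot{x}^2 + \frac{1}{2} I \dot{\phi}^2 + \frac{1}{2} m \left[ \left( \dot{x} + r \dot{\theta} \cos\theta + \dot{r} \sin\theta \right)^2 + \left( \dot{r} \cos\theta - r \dot{\theta} \sin\theta \right)^2 \right] \]

The gravitational force can be absorbed into the scalar potential term of the Lagrangian and includes only the potential energy of the mass $m$ since the potential energy of the rolling wheel is constant.

\[ U = +m g (R + r \cos\theta) \]

Thus the Lagrangian is

\[ L = \frac{1}{2} (M + m) \dot{x}^2 + \frac{1}{2} I \dot{\phi}^2 + \frac{1}{2} m \left[ r^2 \dot{\theta}^2 + 2 \dot{x} r \dot{\theta} \cos\theta + 2 \dot{x} \dot{r} \sin\theta + \dot{r}^2 \right] - m g (R + r \cos\theta) \]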


The equations of constraint are:
1) The wheel rolls without slipping on the ground plane leading to a holonomic constraint:

\[ g_1 = x - R \phi = 0, \quad \text{that is,} \quad \dot{x} - R \dot{\phi} = 0 \]

2) The mass $m$ is touching the periphery of the wheel, that is, the normal force $N > 0$. This is a one-sided restricted holonomic constraint.

\[ g_2 = R - r = 0 \]

3) The mass $m$ does not slip on the wheel if the frictional force $F < N \mu_{static}$. When this restricted holonomic constraint is satisfied, then

\[ g_3 = \theta - \phi = 0 \]

The rolling constraint is holonomic, and can be accounted for using one Lagrange multiplier $\lambda_1$ plus the differential constraint equations


³ This problem is solved in detail in example 3.19 of "Classical Mechanics and Relativity" by Muller-Kirsten [Mu06].




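\[ \frac{\partial g_1}{\partial x} = 1 \qquad \frac{\partial g_1}{\partial \phi} = -R \qquad \frac{\partial g_1}{\partial \theta} = 0 \qquad \frac{\partial g_1}{\partial r} = 0 \]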


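The other two constraints are non-holonomic, and thus these constraint forces are expressed in terms of generalized forces that are related to the tangential force $F$ and radial reaction force $N$. For simplicity, assume that the wheel is a thin-walled cylinder with a moment of inertia of

\[ I = M R^2 \]

The Euler-Lagrange equations for the four coordinates $x, \theta, \phi, r$ are

\begin{align*}
-\frac{d}{dt} \left[ (M + m) \dot{x} + m r \dot{\theta} \cos\theta + m \dot{r} \sin\theta \right] + \lambda_1 + Q_x &= 0 & (\Lambda_x) \\
-m r \dot{x} \dot{\theta} \sin\theta + m \dot{x} \dot{r} \cos\theta + m g r \sin\theta - \frac{d}{dt} \left( m r^2 \dot{\theta} + m r \dot{x} \cos\theta \right) + Q_\theta &= 0 & (\Lambda_\theta) \\
-\frac{d}{dt} \left( M R^2 \dot{\phi} \right) - R \lambda_1 &= 0 & (\Lambda_\phi) \\
m r \dot{\theta}^2 + m \dot{x} \dot{\theta} \cos\theta - m g \cos\theta - \frac{d}{dt} \left( m \dot{x} \sin\theta + m \dot{r} \right) + Q_r &= 0 & (\Lambda_r)
\end{align*}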


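The generalized forces can be related to $F$ and $N$ using the definition

\[ Q_j = \mathbf{F} \cdot \frac{\partial \mathbf{r}}{\partial q_j} \]

where $\mathbf{F}(\mathbf{r})$ is the vectorial sum of the forces acting at $\mathbf{r}$. The components of the vector $\mathbf{r} = (x + r \sin\theta, R + r \cos\theta)$, and $F$ and $N$ in the directions defined in the figure, lead to the generalized forces

\begin{align*}
Q_x &= -F \cos\theta + N \sin\theta \\
Q_\theta &= (-F \cos\theta + N \sin\theta)(R \cos\theta) - (F \sin\theta + N \cos\theta) R \sin\theta = -F R \\
Q_r &= N
\end{align*}

Solving the above 7 equations gives that

\[ -m \ddot{x} \sin\theta + m R \dot{\theta}^2 - m g \cos\theta + N = 0 \]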

This last equation can be derived by Newtonian mechanics from consideration of the forces acting. 

The above equations of motion can be used to calculate the motion for the following conditions. 

a) Mass not slipping: 

This occurs if $\mu = \frac{F}{N} < \mu_{static}$, which also implies that $N > 0$. That is a situation where the system is holonomic with $r = R$, $\dot{x} = R \dot{\phi}$, $\theta = \phi$, which can be solved using the generalized coordinate approach with only one independent coordinate, which can be taken to be $\theta$.

b) Mass slipping:

Here the no-slip constraint is violated and thus one has to explicitly include the generalized forces $Q_x, Q_\theta, Q_r$ and assume that sliding friction is given by $F = N \mu_{sliding}$.

c) Reaction force N is negative: 

Here the mass is not subject to any constraints and it is in free fall. 


The above example illustrates the flexibility provided by Lagrangian mechanics that allows simultane- 
ous use of Lagrange multipliers, generalized forces, and scalar potential to handle combinations of several 
holonomic and nonholonomic constraints for a complicated problem. 
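The projection of the friction and reaction forces onto the generalized coordinates in example 6.19 can be verified numerically. The sketch below is illustrative only: the function name and sample values are assumptions, and the force components follow the directions implied by the expressions for $Q_x$ and $Q_\theta$ in the example, evaluated at $r = R$.

```python
import math

def generalized_forces(F, N, theta, R):
    """Return (Q_x, Q_theta, Q_r) for friction F and reaction N on the small mass."""
    # Cartesian components of the total contact force on the mass
    force_x = -F * math.cos(theta) + N * math.sin(theta)
    force_y = F * math.sin(theta) + N * math.cos(theta)
    # Partial derivatives of the position (x + r sin(theta), R + r cos(theta)) at r = R
    d_pos_dx = (1.0, 0.0)
    d_pos_dtheta = (R * math.cos(theta), -R * math.sin(theta))
    d_pos_dr = (math.sin(theta), math.cos(theta))

    def project(direction):
        return force_x * direction[0] + force_y * direction[1]

    return project(d_pos_dx), project(d_pos_dtheta), project(d_pos_dr)

Q_x, Q_theta, Q_r = generalized_forces(F=2.0, N=5.0, theta=0.7, R=0.4)
print(Q_x, Q_theta, Q_r)
```

For any angle the second and third values reduce to $-FR$ and $N$, which is the simplification used in the example.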
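6.10 Velocity-dependent Lorentz force


The Lorentz force in electromagnetism is unusual in that it is a velocity-dependent force, as well as being a conservative force that can be treated using the concept of potential. That is, the Lorentz force is

\[ \mathbf{F} = q (\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (6.61) \]

It is interesting to use Maxwell's equations and Lagrangian mechanics to show that the Lorentz force can be represented by a conservative potential in Lagrangian mechanics.
Maxwell's equations can be written as

\begin{align*}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\epsilon_0} \qquad (6.62) \\
\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} &= 0 \\
\nabla \cdot \mathbf{B} &= 0 \\
\nabla \times \mathbf{B} - \mu_0 \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} &= \mu_0 \mathbf{J}
\end{align*}

Since $\nabla \cdot \mathbf{B} = 0$ it follows from Appendix A that $\mathbf{B}$ can be represented by the curl of a vector potential, $\mathbf{A}$, that is

\[ \mathbf{B} = \nabla \times \mathbf{A} \qquad (6.63) \]

Substituting this into $\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0$ gives that

\[ \nabla \times \mathbf{E} + \frac{\partial (\nabla \times \mathbf{A})}{\partial t} = 0 \qquad (6.64) \]
\[ \nabla \times \left( \mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} \right) = 0 \]

Since this curl is zero, the bracketed quantity can be represented by the gradient of a scalar potential $\Phi$

\[ \mathbf{E} + \frac{\partial \mathbf{A}}{\partial t} = -\nabla \Phi \qquad (6.65) \]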


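The following shows that this relation corresponds to a velocity-dependent potential $U$ for the charge $q$, where the potential $U$ is given by the relation

\[ U = q (\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.66) \]

where $\Phi$ is the scalar electrostatic potential. This scalar potential $U$ can be employed in the Lagrange equations using the Lagrangian

\[ L = \frac{1}{2} m \mathbf{v} \cdot \mathbf{v} - q (\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.67) \]

The Lorentz force can be derived from this Lagrangian by considering the Lagrange equation for the cartesian coordinate $x$

\[ \frac{d}{dt} \frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0 \qquad (6.68) \]

Using the above Lagrangian (6.67) gives

\[ m \ddot{x} + q \left[ \frac{dA_x}{dt} + \frac{\partial \Phi}{\partial x} - \frac{\partial (\mathbf{A} \cdot \mathbf{v})}{\partial x} \right] = 0 \qquad (6.69) \]

But

\[ \frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x} \dot{x} + \frac{\partial A_x}{\partial y} \dot{y} + \frac{\partial A_x}{\partial z} \dot{z} \qquad (6.70) \]

\[ \frac{\partial (\mathbf{A} \cdot \mathbf{v})}{\partial x} = \frac{\partial A_x}{\partial x} \dot{x} + \frac{\partial A_y}{\partial x} \dot{y} + \frac{\partial A_z}{\partial x} \dot{z} \qquad (6.71) \]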
Inserting equations 6.70 and 6.71 into 6.69 gives

\[ F_x = m \ddot{x} = q \left[ -\left( \frac{\partial \Phi}{\partial x} + \frac{\partial A_x}{\partial t} \right) + \dot{y} \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) - \dot{z} \left( \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \right) \right] = q \left[ \mathbf{E} + \mathbf{v} \times \mathbf{B} \right]_x \qquad (6.72) \]

Corresponding expressions can be obtained for $F_y$ and $F_z$. Thus the total force is the well-known Lorentz force

\[ \mathbf{F} = q (\mathbf{E} + \mathbf{v} \times \mathbf{B}) \qquad (6.73) \]

This has demonstrated that the electromagnetic scalar potential

\[ U = q (\Phi - \mathbf{A} \cdot \mathbf{v}) \qquad (6.74) \]

satisfies Maxwell's equations, gives the Lorentz force, and can be absorbed into the Lagrangian. Note that the velocity-dependent Lorentz force is conservative since $\mathbf{E}$ is conservative, and because $(\mathbf{v} \times \mathbf{B}) \cdot \mathbf{v} \, dt = 0$, the magnetic force does no work since it is perpendicular to the trajectory. The velocity-dependent conservative Lorentz force is an important and ubiquitous force that features prominently in many branches of science. It will be discussed further for the case of relativistic motion in example 17.6.
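The statement that the magnetic part of the Lorentz force does no work can be illustrated numerically. The following Python sketch is an assumption-laden illustration rather than anything from the original text: it takes uniform, static fields with $\mathbf{E} = 0$, dimensionless units with $q = m = B = 1$, and a hand-written Runge-Kutta stepper for the velocity. It checks that the speed is unchanged after one cyclotron period $2\pi m / (qB)$.

```python
import math

def lorentz_acceleration(v, E, B, q, m):
    """Acceleration (q/m)(E + v x B) for velocity v in fields E and B."""
    cross = (v[1] * B[2] - v[2] * B[1],
             v[2] * B[0] - v[0] * B[2],
             v[0] * B[1] - v[1] * B[0])
    return tuple(q / m * (E[i] + cross[i]) for i in range(3))

def rk4_velocity_step(v, dt, E, B, q, m):
    """One Runge-Kutta step for dv/dt; valid as written for uniform static fields."""
    def advance(k, h):
        return tuple(vi + h * ki for vi, ki in zip(v, k))
    k1 = lorentz_acceleration(v, E, B, q, m)
    k2 = lorentz_acceleration(advance(k1, dt / 2), E, B, q, m)
    k3 = lorentz_acceleration(advance(k2, dt / 2), E, B, q, m)
    k4 = lorentz_acceleration(advance(k3, dt), E, B, q, m)
    return tuple(vi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for vi, a, b, c, d in zip(v, k1, k2, k3, k4))

def speed(v):
    return math.sqrt(sum(vi * vi for vi in v))

E_field = (0.0, 0.0, 0.0)
B_field = (0.0, 0.0, 1.0)
v_start = (1.0, 0.0, 0.5)
steps = 2000
dt = 2.0 * math.pi / steps   # one cyclotron period for q = m = B = 1
v_end = v_start
for _ in range(steps):
    v_end = rk4_velocity_step(v_end, dt, E_field, B_field, 1.0, 1.0)
print(speed(v_start), speed(v_end))
```

The velocity component along $\mathbf{B}$ is untouched and the perpendicular components rotate through one full turn, so the kinetic energy is conserved to within the integration error.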


6.11 Time-dependent forces 


All examples discussed in this chapter have assumed Lagrangians that are time independent. Mathematical 
systems where the ordinary differential equations do not depend explicitly on the independent variable, which 
in this case is time t, are called autonomous systems. Systems having differential equations governing the 
dynamical behavior that have time-dependent coefficients are called non-autonomous systems. 

In principle it is trivial to incorporate time-dependent behavior into the equations of motion by intro- 
ducing either a time dependent generalized force Q(r, t), or allowing the Lagrangian to be time dependent. 
For example, in the rocket problem the mass is time dependent. In some cases the time dependent forces 
can be represented by a time-dependent potential energy rather than using a generalized force. Solutions 
for non-autonomous systems can be considerably more difficult to obtain, and can involve regions where the 
motion is stable and other regions where the motion is unstable or chaotic similar to the behavior discussed 
in chapter 4. The following case of a simple pendulum, whose support is undergoing vertical oscillatory 
motion, illustrates the complexities that can occur for systems involving time-dependent forces. 


6.20 Example: Plane pendulum hanging from a vertically-oscillating support 


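Consider a plane pendulum having a mass $M$ fastened to a massless rigid rod of length $L$ that is at an angle $\theta(t)$ to the vertical gravitational field $g$. The pendulum is attached to a support that is subject to a vertical oscillatory force $F$ such that the vertical position $y$ of the support is

\[ y = A \cos\omega t \]

The kinetic energy is

\[ T = \frac{1}{2} M \left[ (L \dot{\theta} \cos\theta)^2 + (\dot{y} + L \dot{\theta} \sin\theta)^2 \right] = \frac{1}{2} M \left[ L^2 \dot{\theta}^2 + 2 L \dot{\theta} \dot{y} \sin\theta + \dot{y}^2 \right] \]

and the potential energy is

\[ U = M g [L (1 - \cos\theta) + y] \]

Thus the Lagrangian is

\[ L = \frac{1}{2} M \left[ L^2 \dot{\theta}^2 + 2 L \dot{\theta} \dot{y} \sin\theta + \dot{y}^2 \right] - M g [L (1 - \cos\theta) + y] \]

The Euler-Lagrange equations lead to equations of motion for $\theta$ and $y$

\[ M L^2 \ddot{\theta} + M L \ddot{y} \sin\theta + M g L \sin\theta = 0 \]
\[ M L \ddot{\theta} \sin\theta + M L \dot{\theta}^2 \cos\theta + M \ddot{y} + M g = F \]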
Assume the small-angle approximation where $\theta \to 0$, then these two equations reduce to

\[ \ddot{\theta} + \left( \frac{g}{L} + \frac{\ddot{y}}{L} \right) \theta = 0 \]
\[ \ddot{y} + g = \frac{F}{M} \]

Substituting $\ddot{y} = -A \omega^2 \cos\omega t$ into these equations gives

\[ \ddot{\theta} + \left( \frac{g}{L} - \frac{A \omega^2}{L} \cos\omega t \right) \theta = 0 \]
\[ M \left( g - A \omega^2 \cos\omega t \right) = F \]

These correspond to stable harmonic oscillations about $\theta \approx 0$ if the bracket term is positive, and to unstable motion if the bracket is negative. Thus, for small amplitude oscillation about $\theta \approx 0$ the motion of the system can be unstable whenever the bracket is negative, that is, when the acceleration $A \omega^2 \cos\omega t > g$, and resonance behavior can occur coupling the pendulum period and the forcing frequency $\omega$.

This discussion also applies to the inverted pendulum with a surprising result. It is well known that the pendulum is unstable near $\theta = \pi$. However, if the support is oscillating, then for $\theta \approx \pi$, where $\sin\theta \approx -(\theta - \pi)$, the equations of motion become

\[ \ddot{\theta} - \left( \frac{g}{L} - \frac{A \omega^2}{L} \cos\omega t \right) (\theta - \pi) = 0 \]
\[ M \left( g - A \omega^2 \cos\omega t \right) = F \]

The inverted pendulum has stable oscillations about $\theta \approx \pi$ if the bracket is negative, that is, if $A \omega^2 \cos\omega t > g$. This illustrates that nonautonomous dynamical systems can involve either stable or unstable motion.
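The $\theta$ equation of example 6.20 is easy to explore numerically. The sketch below is a minimal illustration with assumed parameter values and hypothetical function names: it integrates $L \ddot{\theta} + (g + \ddot{y}) \sin\theta = 0$ with $\ddot{y} = -A \omega^2 \cos\omega t$ using a hand-written Runge-Kutta stepper. As a sanity check, with the drive switched off ($A = 0$) a small oscillation should return to its starting angle after one period $2\pi \sqrt{L/g}$.

```python
import math

def theta_ddot(t, theta, L, g, A, omega):
    """Angular acceleration of the pendulum with support at y = A cos(omega t)."""
    y_ddot = -A * omega**2 * math.cos(omega * t)
    return -(g + y_ddot) / L * math.sin(theta)

def integrate(theta0, theta_dot0, t_end, steps, L, g, A, omega):
    """Integrate from t = 0 to t_end; returns the final (theta, theta_dot)."""
    dt = t_end / steps
    t, th, w = 0.0, theta0, theta_dot0
    for _ in range(steps):
        k1_th, k1_w = w, theta_ddot(t, th, L, g, A, omega)
        k2_th = w + dt / 2 * k1_w
        k2_w = theta_ddot(t + dt / 2, th + dt / 2 * k1_th, L, g, A, omega)
        k3_th = w + dt / 2 * k2_w
        k3_w = theta_ddot(t + dt / 2, th + dt / 2 * k2_th, L, g, A, omega)
        k4_th = w + dt * k3_w
        k4_w = theta_ddot(t + dt, th + dt * k3_th, L, g, A, omega)
        th += dt / 6 * (k1_th + 2 * k2_th + 2 * k3_th + k4_th)
        w += dt / 6 * (k1_w + 2 * k2_w + 2 * k3_w + k4_w)
        t += dt
    return th, w

length, gravity = 1.0, 9.81
period = 2.0 * math.pi * math.sqrt(length / gravity)
# Undriven check: after one period the pendulum is back near its initial angle.
theta_free, rate_free = integrate(0.01, 0.0, period, 2000, length, gravity, 0.0, 0.0)
# Driven case with illustrative amplitude and frequency.
theta_driven, rate_driven = integrate(0.01, 0.0, period, 2000, length, gravity, 0.05, 5.0)
print(theta_free, theta_driven)
```

Varying the assumed amplitude `A` and frequency `omega` in the driven call is a simple way to look for the stable and unstable regions discussed in the example.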


6.12 Impulsive forces 


Colliding bodies often involve large impulsive forces that act for a short time. As discussed in chapter 2.12.8, the treatment of impulsive forces or torques is greatly simplified if they act for a sufficiently short time that the displacement during the impact can be ignored, even though the instantaneous change in velocities may be large. The simplicity is achieved by taking the time integral of the Euler-Lagrange equations over the duration $\tau$ of the impulse and assuming $\tau \to 0$.

The impact of the impulse on a system can be handled two ways. The first approach is to use the Euler-Lagrange equation during the impulse to determine the equations of motion

\[ \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} = Q_j^{EXC} \qquad (6.75) \]

where the impulsive force is introduced using the generalized force $Q_j^{EXC}$. Knowing the initial conditions at time $t$, the conditions at the time $t + \tau$ are given by integration of equation 6.75 over the duration $\tau$ of the impulse, which gives

\[ \int_t^{t+\tau} \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_j} \right) dt - \int_t^{t+\tau} \frac{\partial L}{\partial q_j} \, dt = \int_t^{t+\tau} Q_j^{EXC} \, dt \qquad (6.76) \]

This integration determines the conditions at time $t + \tau$ which then are used as the initial conditions for the motion when the impulsive force $Q_j^{EXC}$ is zero.
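The second approach is to realize that equation 6.76 can be rewritten in the form

\[ \lim_{\tau \to 0} \int_t^{t+\tau} \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_j} \right) dt = \lim_{\tau \to 0} \left. \frac{\partial L}{\partial \dot{q}_j} \right|_t^{t+\tau} = \Delta p_j = \lim_{\tau \to 0} \int_t^{t+\tau} \left( \frac{\partial L}{\partial q_j} + Q_j^{EXC} \right) dt \qquad (6.77) \]

Note that in the limit $\tau \to 0$ the integral of the time derivative of the generalized momentum $p_j = \frac{\partial L}{\partial \dot{q}_j}$ simplifies to give the change in generalized momentum $\Delta p_j$. In addition, assuming that the non-impulsive forces $\left( \frac{\partial L}{\partial q_j} \right)$ are finite and independent of the instantaneous impulsive force during the infinitesimal duration $\tau$, the contribution of the non-impulsive forces $\int_t^{t+\tau} \left( \frac{\partial L}{\partial q_j} \right) dt$ during the impulse can be neglected relative to the large impulsive force term $\lim_{\tau \to 0} \int_t^{t+\tau} Q_j^{EXC} \, dt$. Thus it can be assumed that

\[ \Delta p_j = \lim_{\tau \to 0} \int_t^{t+\tau} Q_j^{EXC} \, dt = \tilde{Q}_j \qquad (6.78) \]

where $\tilde{Q}_j$ is the generalized impulse associated with coordinate $j = 1, 2, 3, \ldots, n$. This generalized impulse can be derived from the time integral of the impulsive forces $\tilde{\mathbf{P}}_i$ given by equation 2.135 using the time integral of equation 6.77, that is

\[ \Delta p_j = \tilde{Q}_j = \lim_{\tau \to 0} \int_t^{t+\tau} Q_j^{EXC} \, dt = \lim_{\tau \to 0} \int_t^{t+\tau} \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \, dt = \sum_i \tilde{\mathbf{P}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \qquad (6.79) \]

Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{P}_j$ with corresponding translational variable $q_j$, or an angular impulsive torque $\tilde{\tau}_j$ with corresponding angular variable $\phi_j$.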

Impulsive force problems usually are solved in two stages. Either equation 6.76 or 6.79 is used to determine the conditions of the system immediately following the impulse. If $\tau \to 0$ then the impulse changes the generalized velocities $\dot{q}_j$ but not the generalized coordinates $q_j$. The subsequent motion then is determined using the Lagrangian equations of motion with the impulsive generalized force being zero, and assuming that the initial condition corresponds to the result of the impulse calculation.


6.21 Example: Series-coupled double pendulum subject to impulsive force 


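Consider a series-coupled double pendulum comprising two masses $m_1$ and $m_2$ connected by rigid massless rods of lengths $L_1$ and $L_2$ as shown in the figure. Initially the two pendula are at rest and hanging vertically when a horizontal impulse $P$ strikes the system at a distance $D$ below the upper fulcrum where $L_1 < D < L_1 + L_2$. For this system the kinetic energies of the masses $m_1$ and $m_2$ are

\[ T_1 = \frac{1}{2} m_1 L_1^2 \dot{\phi}_1^2 \]
\[ T_2 = \frac{1}{2} m_2 \left[ L_1^2 \dot{\phi}_1^2 + 2 L_1 L_2 \dot{\phi}_1 \dot{\phi}_2 \cos(\phi_1 - \phi_2) + L_2^2 \dot{\phi}_2^2 \right] \]

[Figure: Two series-coupled plane pendula.]

Note the velocity of $m_2$ is the vector sum of the two velocities shown, separated by the angle $\phi_2 - \phi_1$. Thus the total kinetic energy is

\[ T = \frac{1}{2} (m_1 + m_2) L_1^2 \dot{\phi}_1^2 + m_2 L_1 L_2 \dot{\phi}_1 \dot{\phi}_2 \cos(\phi_1 - \phi_2) + \frac{1}{2} m_2 L_2^2 \dot{\phi}_2^2 \]

For small angles $\cos(\phi_1 - \phi_2) \approx 1$ to first order, so that

\[ T = \frac{1}{2} (m_1 + m_2) L_1^2 \dot{\phi}_1^2 + m_2 L_1 L_2 \dot{\phi}_1 \dot{\phi}_2 + \frac{1}{2} m_2 L_2^2 \dot{\phi}_2^2 \]

The total potential energy is

\begin{align*}
U &= m_1 g L_1 (1 - \cos\phi_1) + m_2 g [L_1 (1 - \cos\phi_1) + L_2 (1 - \cos\phi_2)] \\
  &= (m_1 + m_2) g L_1 (1 - \cos\phi_1) + m_2 g L_2 (1 - \cos\phi_2)
\end{align*}

Thus, assuming the small-angle approximation, the Lagrangian becomes

\[ L = \frac{1}{2} (m_1 + m_2) L_1^2 \dot{\phi}_1^2 + m_2 L_1 L_2 \dot{\phi}_1 \dot{\phi}_2 + \frac{1}{2} m_2 L_2^2 \dot{\phi}_2^2 - \left( \frac{1}{2} (m_1 + m_2) g L_1 \phi_1^2 + \frac{1}{2} m_2 g L_2 \phi_2^2 \right) \]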
Use equation 6.79 to transform to the generalized coordinates $\phi_1$ and $\phi_2$ with the corresponding generalized impulsive torques

\[ \tilde{Q}_1 = P L_1 \]
\[ \tilde{Q}_2 = P (D - L_1) \]

Since the system starts at rest where $\dot{\phi}_1 = \dot{\phi}_2 = 0$, using equation 6.77 gives the change in angular momentum immediately following the impulse to be

\[ m_1 L_1^2 \dot{\phi}_1 + m_2 L_1 \left( L_1 \dot{\phi}_1 + L_2 \dot{\phi}_2 \right) = P L_1 \]
\[ m_2 L_2 \left( L_1 \dot{\phi}_1 + L_2 \dot{\phi}_2 \right) = P (D - L_1) \]

These two equations determine $\dot{\phi}_1$ and $\dot{\phi}_2$ immediately after the impulse; these can be used with $\phi_1 = \phi_2 = 0$ as initial conditions for solving the subsequent force-free motion when the generalized impulsive force is zero.

As described in example 14.5, the subsequent motion of this series-coupled pendulum will be a superposition of the two normal modes with amplitudes determined by the result of the impulse calculation.
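The two momentum-change equations of example 6.21 form a linear system for the angular velocities just after the impulse. The following Python sketch solves it by Cramer's rule; the function name and the sample values are illustrative assumptions, not part of the original text.

```python
def post_impulse_angular_velocities(P, D, m1, m2, L1, L2):
    """Solve the 2x2 system for (phi1_dot, phi2_dot) just after the impulse P at depth D."""
    # Coefficients of phi1_dot and phi2_dot in the two momentum-change equations
    a11 = (m1 + m2) * L1**2
    a12 = m2 * L1 * L2
    a21 = m2 * L1 * L2
    a22 = m2 * L2**2
    # Generalized impulsive torques
    b1 = P * L1
    b2 = P * (D - L1)
    det = a11 * a22 - a12 * a21  # equals m1 m2 L1^2 L2^2, nonzero for physical values
    phi1_dot = (b1 * a22 - a12 * b2) / det
    phi2_dot = (a11 * b2 - a21 * b1) / det
    return phi1_dot, phi2_dot

# Unit masses and lengths, impulse applied halfway down the lower rod
print(post_impulse_angular_velocities(P=1.0, D=1.5, m1=1.0, m2=1.0, L1=1.0, L2=1.0))
```

These values, together with $\phi_1 = \phi_2 = 0$, are the initial conditions for the subsequent force-free motion.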


6.13 The Lagrangian versus the Newtonian approach to classical 
mechanics 


It is useful to contrast the differences, and relative advantages, of the Newtonian and Lagrangian formulations 
of classical mechanics. The Newtonian force-momentum formulation is vectorial in nature and has cause and
effect embedded in it. The Lagrangian approach is cast in terms of kinetic and potential energies which involve
only scalar functions, and the equations of motion come from a single scalar function, i.e. the Lagrangian. The
directional properties of the equations of motion come from the requirement that the trajectory is specified 
by the principle of least action. The directional properties of the vectors in the Newtonian approach assist 
in our intuition when setting up a problem, but the Lagrangian method is simpler mathematically when the 
mechanical system is more complex. 

The major advantage of the variational approaches to mechanics is that solution of the dynamical equations of motion can be simplified by expressing the motion in terms of independent generalized coordinates. For Lagrangian mechanics these generalized coordinates can be any set of independent variables, $q_i$, where $1 \le i \le n$, plus the corresponding velocities $\dot{q}_i$. These independent generalized coordinates
completely specify the scalar potential and kinetic energies used in the Lagrangian or Hamiltonian. The
variational approach allows for a much larger arsenal of possible generalized coordinates than the typical 
vector coordinates used in Newtonian mechanics. For example, the generalized coordinates can be dimension- 
less amplitudes for the N normal modes of coupled oscillator systems, or action-angle variables. Moreover, 
very different generalized coordinates can be used for each of the n variables. The tremendous freedom 
plus flexibility of the choice of generalized coordinates is important when constraint forces are acting on the 
system. Generalized coordinates allow the constraint forces to be ignored by including auxiliary conditions 
to account for the kinematic constraints that lead to correlated motion. The Lagrange method provides 
an incredibly consistent and mechanistic problem-solving strategy for many-body systems subject to constraints. Expressed in terms of generalized coordinates, Lagrange's equations can be applied to a wide
variety of physical problems including those involving fields. The manipulation of scalar quantities in a
configuration space of generalized coordinates can greatly simplify problems compared with being confined 
to a rigid orthogonal coordinate system characterized by the Newtonian vector approach. 

The use of generalized coordinates in Lagrange’s equations of motion can be applied to a wide range
of physical phenomena including field theory, such as for electromagnetic fields, which are beyond the
applicability of Newton’s equations of motion. The superiority of the Lagrangian approach compared to the
Newtonian approach for solving problems in mechanics is apparent when dealing with holonomic constraint
forces. Constraint forces must be known and included explicitly in the Newtonian equations of motion.
Unfortunately, knowledge of the equations of motion is required to derive these constraint forces. For holonomic
constrained systems, the equations of motion can be solved directly, without calculating the constraint forces,
by using the minimal set of generalized coordinates approach to Lagrangian mechanics. Moreover, the Lagrange
approach has significant philosophical advantages compared to the Newtonian approach.
6.14 Summary


Newtonian plausibility argument for Lagrangian mechanics: 

A justification for introducing the calculus of variations to classical mechanics becomes apparent when
the concept of the Lagrangian $L = T - U$ is used in the functional and time $t$ is the independent variable.
It was shown that Newton’s equation of motion can be rewritten as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F^{EX}_{q_i} \tag{6.12}$$

where $F^{EX}_{q_i}$ are the excluded forces of constraint plus any other conservative or non-conservative forces not
included in the potential $U$. This corresponds to the Euler-Lagrange equation for determining the minimum
of the time integral of the Lagrangian. Equation 6.12 can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + F^{EXC}_{q_i} \tag{6.15}$$

where the Lagrange multiplier term accounts for holonomic constraint forces, and $F^{EXC}_{q_i}$ includes all
additional forces not accounted for by the scalar potential $U$ or by the Lagrange multiplier terms. The
constraint forces can be included explicitly as generalized forces in the excluded term $F^{EXC}_{q_i}$ of equation
6.15.

d’Alembert’s Principle 

It was shown that d’Alembert’s Principle 


$$\sum_{i}^{N} \left(\mathbf{F}_i^{A} - \dot{\mathbf{p}}_i\right) \cdot \delta\mathbf{r}_i = 0 \tag{6.25}$$


cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual 
work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s principle 
applied to dynamics leads to differential equations of motion. 

Lagrange equations from d’Alembert’s Principle 

After transforming to generalized coordinates, d’Alembert’s Principle leads to 


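$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_j}\right) - \frac{\partial T}{\partial q_j} \right\} - Q_j \right] \delta q_j = 0 \tag{6.38}$$

If all the $n$ generalized coordinates $q_j$ are independent, then equation 6.38 implies that the term in the square
brackets is zero for each individual value of $j$. That is, this implies the basic Euler-Lagrange equations of
motion.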

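The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming
that the generalized force $Q_j = \sum_i^n \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term,
that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$, plus an excluded generalized force
$Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly
included in the potential $U_j$. That is,

$$Q_j = -\nabla U_j + Q_j^{EX} \tag{6.41}$$

Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be
rewritten as

$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left(\frac{\partial (T-U)}{\partial \dot{q}_j}\right) - \frac{\partial (T-U)}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.42}$$

Expressed in terms of the standard Lagrangian $L = T - U$ this gives

$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.44}$$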
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) for the special case when
$U = 0$. In addition, note that if all the generalized coordinates are independent, then the square bracket
terms are zero for each value of $j$, which leads to the $n$ general Euler-Lagrange equations of motion

$$\left\{ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \tag{6.45}$$

where $n \geq j \geq 1$.


Newtonian mechanics has trouble handling constraint forces because they lead to coupling of the degrees 
of freedom. Lagrangian mechanics is more powerful since it provides the following three ways to handle such 
correlated motion. 

1) Minimal set of generalized coordinates 

If the $n$ coordinates $q_j$ are independent, then the square bracket equals zero for each value of $j$ in equation
6.44, which corresponds to Euler’s equation for each of the $n$ independent coordinates. If the $n$ generalized
coordinates are coupled by $m$ constraints, then the coordinates can be transformed to a minimal set of
$s = n - m$ independent coordinates which then can be solved by applying equation 6.45 to the minimal set
of $s$ independent coordinates.

2) Lagrange multipliers approach 

The Lagrangian method concentrates solely on active forces, completely ignoring all other internal forces.
In Lagrangian mechanics the generalized forces, corresponding to each generalized coordinate, can be
partitioned three ways

$$Q_j = -\nabla U + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}$$

where the velocity-independent conservative forces can be absorbed into a scalar potential $U$, the holonomic
constraint forces can be handled using the Lagrange multiplier term $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$, and the remaining
part of the active forces can be absorbed into the generalized force $Q_j^{EXC}$. The scalar potential energy $U$ is
handled by absorbing it into the standard Lagrangian $L = T - U$. If the constraint forces are holonomic then
these forces are easily and elegantly handled by use of Lagrange multipliers. All remaining forces, including
dissipative forces, can be handled by including them explicitly in the generalized force $Q_j^{EXC}$.

Combining the above two equations gives

$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EXC} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \right] \delta q_j = 0 \tag{6.56}$$

Use of the Lagrange multipliers to handle the $m$ constraint forces ensures that all $n$ infinitesimals $\delta q_j$ are
independent, implying that the expression in the square bracket must be zero for each of the $n$ values of $j$.
This leads to $n$ Lagrange equations plus $m$ constraint relations

$$\left\{ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{6.60}$$

where $j = 1, 2, 3, \ldots, n$.
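
As a hedged illustration of the Lagrange multiplier approach, the sketch below treats a simple plane pendulum in polar coordinates $(r, \theta)$, with $\theta$ measured from the downward vertical, and adjoins the holonomic constraint $g_1 = r - l = 0$ with a single multiplier $\lambda$. The pendulum, the function name `pendulum_multiplier`, and the parameter values are illustrative assumptions that are not taken from the text. With $L = \frac{1}{2}m(\dot{r}^2 + r^2\dot{\theta}^2) + mgr\cos\theta$, the radial Lagrange equation evaluated on the constraint gives $\lambda = -(ml\dot{\theta}^2 + mg\cos\theta)$, and the angular equation gives $\ddot{\theta} = -(g/l)\sin\theta$.

```python
import math


def pendulum_multiplier(theta, theta_dot, m=1.0, l=1.0, g=9.81):
    """Return (lam, theta_ddot) for a plane pendulum with constraint r - l = 0.

    Assumed setup (not from the text): polar coordinates with theta measured
    from the downward vertical, L = m/2 (r_dot^2 + r^2 theta_dot^2) + m g r cos(theta).
    """
    # Radial equation with r = l, r_dot = r_ddot = 0:
    #   -(m l theta_dot^2 + m g cos(theta)) = lam * d(g_1)/dr = lam
    lam = -(m * l * theta_dot**2 + m * g * math.cos(theta))
    # Angular equation: m l^2 theta_ddot + m g l sin(theta) = 0
    theta_ddot = -(g / l) * math.sin(theta)
    return lam, theta_ddot


# Pendulum hanging at rest: the constraint force just balances the weight.
lam_rest, acc_rest = pendulum_multiplier(0.0, 0.0)
```

The negative sign of $\lambda$ indicates that the constraint force points toward the pivot; its magnitude is the string tension, which the minimal-coordinate approach would not provide.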

3) Generalized forces approach 

The two right-hand terms in (6.60) can be understood to be those forces acting on the system that are
not absorbed into the scalar potential $U$ component of the Lagrangian $L$. The Lagrange multiplier terms
$\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ account for the holonomic forces of constraint that are not included in the conservative
potential or in the generalized forces $Q_j^{EXC}$. The generalized force

$$Q_j^{EXC} = \sum_{i}^{n} \mathbf{F}_i^{EXC} \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.17}$$

is the sum of the components in the $q_j$ direction for all external forces that have not been taken into account
by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force $Q_j^{EXC}$
contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not
included in $U$, or used in the Lagrange multiplier terms to account for the holonomic constraint forces.
Applying the Euler-Lagrange equations in mechanics:

The optimal way to exploit Lagrangian mechanics is as follows:

1. Select a set of independent generalized coordinates.

2. Partition the active forces into three groups:

(a) Conservative one-body forces
(b) Holonomic constraint forces
(c) Generalized forces

3. Minimize the number of generalized coordinates.

4. Derive the Lagrangian.

5. Derive the equations of motion.


Velocity-dependent Lorentz force: 

Usually velocity-dependent forces are non-holonomic. However, electromagnetism is a special case where
the velocity-dependent Lorentz force $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ can be obtained from a velocity-dependent potential
function $U(q, \dot{q}, t)$. It was shown that the velocity-dependent potential

$$U = q\Phi - q\mathbf{v} \cdot \mathbf{A} \tag{6.74}$$

leads to the Lorentz force where $\Phi$ is the scalar electric potential and $\mathbf{A}$ the vector potential.
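
The following sketch is a numerical check, under stated assumptions, that a velocity-dependent potential of the form $U = q\Phi - q\mathbf{v} \cdot \mathbf{A}$ reproduces the Lorentz force through $F_i = -\frac{\partial U}{\partial x_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot{x}_i}$. The fields are assumed to be static and uniform, with $\mathbf{E}$ along $x$ and $\mathbf{B}$ along $z$, described by $\Phi = -Ex$ and the symmetric-gauge $\mathbf{A} = \frac{B}{2}(-y, x, 0)$. All function names and numerical values are hypothetical choices made for illustration.

```python
def potential(r, v, q=1.0, E=2.0, B=3.0):
    """U = q*Phi - q v.A for assumed uniform fields E x-hat and B z-hat."""
    x, y, z = r
    vx, vy, vz = v
    phi = -E * x                           # scalar potential of uniform E along x
    A = (-0.5 * B * y, 0.5 * B * x, 0.0)   # symmetric-gauge vector potential
    return q * phi - q * (vx * A[0] + vy * A[1] + vz * A[2])


def shifted(vec, i, h):
    """Copy of vec with component i displaced by h."""
    out = list(vec)
    out[i] += h
    return tuple(out)


def dU_dr(r, v, i, h=1e-3):
    """Central-difference estimate of dU/dx_i."""
    return (potential(shifted(r, i, h), v) - potential(shifted(r, i, -h), v)) / (2 * h)


def dU_dv(r, v, i, h=1e-3):
    """Central-difference estimate of dU/dv_i."""
    return (potential(r, shifted(v, i, h)) - potential(r, shifted(v, i, -h))) / (2 * h)


def force_from_potential(r, v, h=1e-3):
    """F_i = -dU/dx_i + d/dt(dU/dv_i).

    For static fields and U linear in v, the total time derivative
    reduces to sum_j v_j d/dx_j (dU/dv_i).
    """
    force = []
    for i in range(3):
        ddt = 0.0
        for j in range(3):
            ddt += v[j] * (dU_dv(shifted(r, j, h), v, i)
                           - dU_dv(shifted(r, j, -h), v, i)) / (2 * h)
        force.append(-dU_dr(r, v, i) + ddt)
    return tuple(force)


def lorentz_force(v, q=1.0, E=2.0, B=3.0):
    """q (E + v x B) for E along x and B along z."""
    vx, vy, vz = v
    return (q * (E + vy * B), q * (-vx * B), 0.0)
```

Because this $U$ is at most bilinear in position and velocity, the central differences are exact apart from rounding, so the two force evaluations agree closely for any choice of position and velocity.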

Time-dependent forces: 

It was shown that time-dependent forces can lead to complicated motion having both stable regions and 
unstable regions of motion that can exhibit chaos. 

Impulsive forces: 

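A generalized impulse $\tilde{Q}_j$ can be derived for an instantaneous impulsive force from the time integral of
the impulsive forces $\tilde{\mathbf{P}}_i$ given by equation 2.135 using the time integral of equation 6.78, that is

$$\Delta p_j = \tilde{Q}_j = \lim_{\tau \to 0} \int_t^{t+\tau} Q_j^{EXC}\, dt = \lim_{\tau \to 0} \sum_i \int_t^{t+\tau} \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}\, dt \equiv \sum_i \tilde{\mathbf{P}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.79}$$

Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}_j$ with corresponding translational
variable $q_j$ or an angular impulsive torque $\tilde{\boldsymbol{\tau}}_j$ with corresponding angular variable $\phi_j$.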

Comparison of Newtonian and Lagrangian mechanics: 

In contrast to Newtonian mechanics, which is based on knowing all the vector forces acting on a system, 
Lagrangian mechanics can derive the equations of motion using generalized coordinates without requiring 
knowledge of the constraint forces acting on the system. Lagrangian mechanics provides a remarkably 
powerful, and incredibly consistent approach to solving for the equations of motion in classical mechanics, 
and is especially powerful for handling systems that are subject to holonomic constraints. 
Workshop exercises


1. A disk of mass M and radius R rolls without slipping down a plane inclined from the horizontal by an angle 
a. The disk has a short weightless axle of negligible radius. From this axis is suspended a simple pendulum of 
length l < R and whose bob has a mass m. Assume that the motion of the pendulum takes place in the plane 
of the disk. 


(a) What generalized coordinates would be appropriate for this situation? 
(b) Are there any equations of constraint? If so, what are they? 


(c) Find Lagrange’s equations for this system. 


2. A Lagrangian for a particular system can be written as

$$L = \frac{m}{2}\left(a\dot{x}^2 + 2b\dot{x}\dot{y} + c\dot{y}^2\right) - \frac{K}{2}\left(ax^2 + 2bxy + cy^2\right)$$

where $a$, $b$, and $c$ are arbitrary constants, but subject to the condition that $b^2 - 4ac \neq 0$.

(a) What are the equations of motion?
(b) Examine the case $a = 0 = c$. What physical system does this represent?
(c) Examine the case $b = 0$ and $a = -c$. What physical system does this represent?
(d) Based on your answers to (b) and (c), determine the physical system represented by the Lagrangian given
above.


3. Consider a particle of mass m moving in a plane and subject to an inverse square attractive force. 


(a) Obtain the equations of motion. 
(b) Is the angular momentum about the origin conserved? 
(c) Obtain expressions for the generalized forces. Recall that the generalized forces are defined by

$$Q_j = \sum_i F_i \frac{\partial x_i}{\partial q_j}$$


4. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative
of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term
“generalized mechanics” is used.


(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations,
and assuming that Hamilton's principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at
the end points, show that the corresponding Lagrange equation is

$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial \ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$


Such equations of motion have interesting applications in chaos theory. 


(b) Apply this result to the Lagrangian 


Do you recognize the equations of motion? 


5. A bead of mass $m$ slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the
vertical $(x, z)$ plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on $m$?

(b) Set up Lagrange’s equation of motion for $x$ with the constraint embedded.

(c) Set up Lagrange’s equations of motion for both $x$ and $z$ with the constraint adjoined and a Lagrange
multiplier $\lambda$ introduced.

(d) Show that the same equation of motion for $x$ results from either of the methods used in part (b) or part
(c).

(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.

(f) What are the $x$ and $z$ components of the force of constraint in terms of $x$ and $\dot{x}$?


6. Consider the two Lagrangians

$$L(q, \dot{q}; t) \quad \text{and} \quad L'(q, \dot{q}; t) = L(q, \dot{q}; t) + \frac{dF(q,t)}{dt}$$

where $F(q,t)$ is an arbitrary function of the generalized coordinates $q(t)$. Show that these two Lagrangians
yield the same Euler-Lagrange equations. As a consequence two Lagrangians that differ only by an exact time
derivative are said to be equivalent.


7. Consider the double pendulum comprising masses $m_1$ and $m_2$ connected by inextensible strings as shown in
the figure. Assume that the motion of the pendulum takes place in a vertical plane.


(a) Are there any equations of constraint? If so, what are they? 


(b) Find Lagrange’s equations for this system. 


8. Consider the system shown in the figure which consists of a mass $m$ suspended via a constrained massless link
of length $L$ where the point A is acted upon by a spring of spring constant $k$. The spring is unstretched when
the massless link is horizontal. Assume that the holonomic constraints at A and B are frictionless.

(a) Derive the equations of motion for the system using the method of Lagrange multipliers.


9. Consider a pendulum, with mass $m$, connected to a (horizontally) moveable support of mass $M$.

(a) Determine the Lagrangian of the system.
(b) Determine the equations of motion for $\theta \ll 1$.
(c) Find an equation of motion in $\theta$ alone. What is the frequency of oscillation?
(d) What is the frequency of oscillation for $M \gg m$? Does this make sense?
Problems


1. A sphere of radius $\rho$ is constrained to roll without slipping on the lower half of the inner surface of a hollow
cylinder of radius $R$. Determine the Lagrangian function, the equation of constraint, and the Lagrange equations
of motion. Find the frequency of small oscillations.


2. A particle moves in a plane under the influence of a force $f = -Ar^{\alpha - 1}$ directed toward the origin; $A$ and
$\alpha$ $(> 0)$ are constants. Choose generalized coordinates with the potential energy zero at the origin.

a) Find the Lagrangian equations of motion.
b) Is the angular momentum about the origin conserved?
c) Is the total energy conserved?


3. Two blocks, each of mass $M$, are connected by an extensionless, uniform string of length $l$. One block is placed
on a frictionless horizontal surface, and the other block hangs over the side, the string passing over a frictionless
pulley. Describe the motion of the system:

a) when the mass of the string is negligible
b) when the string has mass $m$.

4. Two masses $m_1$ and $m_2$ $(m_1 \neq m_2)$ are connected by a rigid rod of length $d$ and of negligible mass. An
extensionless string of length $l_1$ is attached to $m_1$ and connected to a fixed point of the support $P$. Similarly
a string of length $l_2$ $(l_1 \neq l_2)$ connects $m_2$ and $P$. Obtain the equation of motion describing the motion in
the plane of $m_1$, $m_2$, and $P$, and find the frequency of small oscillation around the equilibrium position.


5. A thin uniform rigid rod of length 2L and mass M is suspended by a massless string of length l. Initially the 
system is hanging vertically downwards in the gravitational field g. Use as generalized coordinates the angles 
given in the diagram. 


a) Derive the Lagrangian for the system. 
b) Use the Lagrangian to derive the equations of motion. 


c) A horizontal impulsive force $F_x$ in the $x$ direction strikes the bottom end of the rod for an infinitesimal
time $\tau$. Derive the initial conditions for the system immediately after the impulse has occurred.


d) Draw a diagram showing the geometry of the pendulum shortly after the impulse when the displacement 
angles are significant. 


Chapter 7 


Symmetries, Invariance and the 
Hamiltonian 


7.1 Introduction 


The chapter 6 discussion of Lagrangian dynamics illustrates the power of Lagrangian mechanics for deriving
the equations of motion. In contrast to Newtonian mechanics, which is expressed in terms of force vectors
acting on a system, the Lagrangian method, based on d’Alembert’s Principle or Hamilton’s Principle, is
expressed in terms of the scalar kinetic and potential energies of the system. The Lagrangian approach is a
sophisticated alternative to Newton’s laws of motion that provides a simpler derivation of the equations of
motion and allows constraint forces to be ignored. In addition, the use of Lagrange multipliers or generalized
forces allows the Lagrangian approach to determine the constraint forces when these forces are of interest.
The equations of motion, derived either from Newton’s Laws or Lagrangian dynamics, can be non-trivial to
solve mathematically. It is necessary to integrate second-order differential equations, which for $n$ degrees of
freedom, imply $2n$ constants of integration.

Chapter 7 will explore the remarkable connection between symmetry and invariance of a system under 
transformation, and the related conservation laws that imply the existence of constants of motion. Even 
when the equations of motion cannot be solved easily, it is possible to derive important physical principles
regarding the first-order integrals of motion of the system directly from the Lagrange equation, and to
elucidate the underlying symmetries and invariance. This property is contained in Noether’s theorem
which states that conservation laws are associated with differentiable symmetries of a physical system.


7.2 Generalized momentum 


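Consider a holonomic system of $N$ masses under the influence of conservative forces that depend on position
$q_j$ but not velocity $\dot{q}_j$, that is, the potential is velocity independent. Then for the $x$ coordinate of particle $i$
for $N$ particles

$$\frac{\partial L}{\partial \dot{x}_i} = \frac{\partial T}{\partial \dot{x}_i} - \frac{\partial U}{\partial \dot{x}_i} = \frac{\partial T}{\partial \dot{x}_i} = \frac{\partial}{\partial \dot{x}_i} \sum_{k=1}^{N} \frac{1}{2} m_k \left(\dot{x}_k^2 + \dot{y}_k^2 + \dot{z}_k^2\right) = m_i \dot{x}_i = p_{i,x} \tag{7.1}$$

Thus for a holonomic, conservative, velocity-independent potential we have

$$\frac{\partial L}{\partial \dot{x}_i} = p_{i,x} \tag{7.2}$$

which is the $x$ component of the linear momentum for the $i^{th}$ particle.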
This result suggests an obvious extension of the concept of momentum to generalized coordinates. The
generalized momentum associated with the coordinate $q_j$ is defined to be

$$\frac{\partial L}{\partial \dot{q}_j} \equiv p_j \tag{7.3}$$

Note that $p_j$ also is called the conjugate momentum or canonical momentum to $q_j$ where $q_j, p_j$ are
conjugate, or canonical, variables. Remember that the linear momentum $p_j$ is the first-order time integral
given by equation 2.10. If $q_j$ is not a spatial coordinate, then $p_j$ is the generalized momentum, not the
kinematic linear momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum. That
is, the generalized momentum may differ from the usual linear or angular momentum since the definition
(7.3) is more general than the usual $p_x = m\dot{x}$ definition of linear momentum in classical mechanics. This is
illustrated by the case of moving charged particles, of mass $m_j$ and charge $e_j$, in an electromagnetic field.
Chapter 6 showed that electromagnetic forces on a charge $e_j$ can be described in terms of a scalar potential $U_j$ where

$$U_j = e_j(\Phi - \mathbf{A} \cdot \mathbf{v}_j) \tag{7.4}$$

Thus the Lagrangian for the electromagnetic force can be written as

$$L = \sum_{j=1}^{N} \left[ \frac{1}{2} m_j \mathbf{v}_j \cdot \mathbf{v}_j - e_j(\Phi - \mathbf{A} \cdot \mathbf{v}_j) \right] \tag{7.5}$$

The generalized momentum to the coordinate $x_j$ for charge $e_j$, and mass $m_j$, is given by the above Lagrangian

$$p_{j,x} = \frac{\partial L}{\partial \dot{x}_j} = m_j \dot{x}_j + e_j A_x \tag{7.6}$$

Note that this includes both the mechanical linear momentum plus the correct electromagnetic momentum.
The fact that the electromagnetic field carries momentum should not be a surprise since electromagnetic
waves also carry energy as is illustrated by the transmission of radiant energy from the sun.


7.1 Example: Feynman's angular-momentum paradox 


Feynman[Fey84] posed the following paradox. A circular insulating disk, mounted on frictionless bearings, 
has a circular ring of total charge q uniformly distributed around the perimeter of the circular disk at the 
radius R. A superconducting long solenoid of radius s, where s < R, is fixed to the disk and is mounted 
coaxial with the bearings. The moment of inertia of the system about the rotation axis is I. Initially the disk 
plus superconducting solenoid are stationary with a steady current producing a uniform magnetic field $B_0$
inside the solenoid. Assume that a rise in temperature of the solenoid destroys the superconductivity leading 
to a rapid dissipation of the electric current and resultant magnetic field. Assume that the system is free to 
rotate, no other forces or torques are acting on the system, and that the charge carriers in the solenoid have 
zero mass and thus do not contribute to the angular momentum. Does the system rotate when the current in 
the solenoid stops? 

Initially the system is stationary with zero mechanical angular momentum. Faraday’s Law states that,
when the magnetic field dissipates from $B_0$ to zero, there will be a torque $N$ acting on the circumferential
charge $q$ at radius $R$ due to the change in magnetic flux $\Phi$.

$$N(t) = -qR\frac{d\Phi}{dt}$$

Since $\frac{d\Phi}{dt} < 0$, this torque leads to an angular impulse which will equal the final mechanical angular momentum.

$$L^{mech}_{final} = T = \int_t N(t)\, dt = qR\Phi$$

[Figure: disk labeled with a superconducting coil and a uniform surface charge q.]

The initial angular momentum in the electromagnetic field can be derived using equation 7.6, plus Stokes’
theorem (Appendix H3). Equation 2.142 gives that the final angular momentum equals the angular impulse

$$L^{EM}_{initial} = R \int_t \oint \dot{p}_\phi\, dl\, dt = R \oint p_\phi\, dl = qR \oint A_\phi\, dl = qR \int \mathbf{B} \cdot d\mathbf{S} = qR\Phi$$

where $\Phi = \oint A_\phi\, dl = \int \mathbf{B} \cdot d\mathbf{S}$ is the initial total magnetic flux through the solenoid. Thus the total initial
angular momentum is given by

$$L^{TOTAL}_{initial} = 0 + L^{EM}_{initial} = qR\Phi$$

Since the final electromagnetic field is zero the final total angular momentum is given by

$$L^{TOTAL}_{final} = L^{mech}_{final} + 0 = qR\Phi$$

Note that the total angular momentum is conserved. That is, initially all the angular momentum is stored in
the electromagnetic field, whereas the final angular momentum is all mechanical. This explains the paradox:
the mechanical angular momentum is not conserved; only the total angular momentum of the system is
conserved, that is, the sum of the mechanical and electromagnetic angular momenta.


7.3 Invariant transformations and Noether’s Theorem 


One of the great advantages of Lagrangian mechanics is the freedom it allows in choice of generalized
coordinates which can simplify derivation of the equations of motion. For example, for any set of coordinates,
$q_j$, a reversible point transformation can define another set of coordinates $q'_j$ such that

$$q'_j = q'_j(q_1, q_2, \ldots, q_n; t) \tag{7.7}$$

The new set of generalized coordinates satisfies Lagrange’s equations of motion with the new Lagrangian

$$L(\mathbf{q}', \dot{\mathbf{q}}', t) = L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.8}$$

The Lagrangian is a scalar, with units of energy, which does not change if the coordinate representation
is changed. Thus $L(\mathbf{q}', \dot{\mathbf{q}}', t)$ can be derived from $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ by substituting the inverse relation
$q_i = q_i(q'_1, q'_2, \ldots, q'_n; t)$ into $L(\mathbf{q}, \dot{\mathbf{q}}, t)$. That is, the value of the Lagrangian $L$ is independent of which coordinate
representation is used. Although the general form of Lagrange’s equations of motion is preserved in any
point transformation, the explicit equations of motion for the new variables usually look different from those
with the old variables. A typical example is the transformation from cartesian to spherical coordinates. For
a given system, there can be particular transformations for which the explicit equations of motion are the
same for both the old and new variables. Transformations for which the equations of motion are invariant
are called invariant transformations. It will be shown that if the Lagrangian does not explicitly contain
a particular coordinate of displacement $q_i$, then the corresponding conjugate momentum, $p_i$, is conserved.
This relation is called Noether’s theorem which states “For each symmetry of the Lagrangian, there is a
conserved quantity”.

Noether’s Theorem will be used to consider invariant transformations for two dependent variables, $x(t)$
and $\theta(t)$, plus their conjugate momenta $p_x$ and $p_\theta$. For a closed system, these provide up to six possible
conservation laws for the three axes. Then we will discuss the independent variable $t$, and its relation to
the Generalized Energy Theorem, which provides another possible conservation law. For simplicity, these 
discussions will assume that the systems are holonomic and conservative. 

The Lagrange equations using generalized coordinates for holonomic systems were given by equation 6.60
to be

$$\left\{ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{7.9}$$

This can be written in terms of the generalized momentum as

$$\left\{ \frac{d}{dt} p_j - \frac{\partial L}{\partial q_j} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{7.10}$$

or equivalently as

$$\dot{p}_j = \frac{\partial L}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \right] \tag{7.11}$$

Note that if the Lagrangian $L$ does not contain $q_j$ explicitly, that is, the Lagrangian is invariant to a linear
translation, or equivalently, is spatially homogeneous, and if the Lagrange multiplier constraint force and
generalized force terms are zero, then

$$\frac{\partial L}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \right] = 0 \tag{7.12}$$

In this case the Lagrange equation reduces to

$$\dot{p}_j = \frac{dp_j}{dt} = 0 \tag{7.13}$$

Equation 7.13 corresponds to $p_j$ being a constant of motion. Stated in words, the generalized momentum $p_j$
is a constant of motion if the Lagrangian is invariant to a spatial translation of $q_j$, and the constraint plus
generalized force terms are zero. Expressed another way, if the Lagrangian does not contain a given coordinate
$q_j$ and the corresponding constraint plus generalized forces are zero, then the generalized momentum
associated with this coordinate is conserved. Note that this example of Noether’s theorem applies to any
component of $\mathbf{q}$. For example, in the uniform gravitational field at the surface of the earth, the Lagrangian
does not depend on the $x$ and $y$ coordinates in the horizontal plane, thus $p_x$ and $p_y$ are conserved, whereas,
due to the gravitational force, the Lagrangian does depend on the vertical $z$ axis and thus $p_z$ is not conserved.
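
The uniform-gravity case can be checked numerically. The sketch below is a minimal illustration, assuming a single particle with Lagrangian $L = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - mgz$ and no constraint or generalized forces, so that $\dot{p}_j = \frac{\partial L}{\partial q_j}$. The function names and numerical values are hypothetical; the derivative is estimated by a central difference.

```python
def lagrangian(r, v, m=2.0, g=9.81):
    """L = T - U for one particle in a uniform gravitational field along -z."""
    x, y, z = r
    kinetic = 0.5 * m * sum(c * c for c in v)
    return kinetic - m * g * z


def momentum_rates(r, v, h=1e-4):
    """Estimate dp_j/dt = dL/dq_j for q = (x, y, z) by central differences."""
    rates = []
    for j in range(3):
        plus = list(r)
        minus = list(r)
        plus[j] += h
        minus[j] -= h
        rates.append((lagrangian(plus, v) - lagrangian(minus, v)) / (2 * h))
    return rates


# x and y do not appear in L, so the first two rates vanish;
# the third equals -m*g, so p_z is not conserved.
rates = momentum_rates((1.0, -2.0, 3.0), (0.5, 0.1, -0.3))
```

The same finite-difference test can be applied to any Lagrangian to look for coordinates that do not appear explicitly.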


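7.2 Example: Atwood’s machine


Assume that the linear momentum is conserved for the Atwood’s machine shown in the figure below. Let
the left mass rise a distance $x$ and the right mass rise a distance $y$. Then the middle mass must drop by
$x + y$ to conserve the length of the string. The Lagrangian of the system is

$$L = \frac{1}{2}(4m)\dot{x}^2 + \frac{1}{2}(3m)(-\dot{x} - \dot{y})^2 + \frac{1}{2}m\dot{y}^2 - \left(4mgx + 3mg(-x - y) + mgy\right) = \frac{7}{2}m\dot{x}^2 + 3m\dot{x}\dot{y} + 2m\dot{y}^2 - mg(x - 2y)$$

Note that the transformation

$$x = x_0 + 2\epsilon \qquad y = y_0 + \epsilon$$

results in the potential energy term $mg(x - 2y) = mg(x_0 - 2y_0)$ which is a constant of motion. As a result
the Lagrangian is independent of $\epsilon$, which means that it is invariant to the small perturbation $\epsilon$, and thus
$\frac{dL}{d\epsilon} = 0$. Therefore, according to Noether’s theorem, the corresponding linear momentum $P_\epsilon = \frac{\partial L}{\partial \dot{\epsilon}}$ is
conserved. This conserved linear momentum then is given by

$$P_\epsilon = \frac{\partial L}{\partial \dot{\epsilon}} = \frac{\partial L}{\partial \dot{x}}\frac{\partial \dot{x}}{\partial \dot{\epsilon}} + \frac{\partial L}{\partial \dot{y}}\frac{\partial \dot{y}}{\partial \dot{\epsilon}} = m(7\dot{x} + 3\dot{y})(2) + m(3\dot{x} + 4\dot{y}) = m(17\dot{x} + 10\dot{y})$$

[Figure: Example of an Atwood’s machine, with masses 4m, 3m, and m.]

Thus, if the system starts at rest with $P_\epsilon = 0$, then $\dot{x}$ always equals $-\frac{10}{17}\dot{y}$ since $P_\epsilon$ is constant.
Note that this also can be shown using the Euler-Lagrange equations in that $\Lambda_x L = 0$ and $\Lambda_y L = 0$ give

$$7m\ddot{x} + 3m\ddot{y} = -mg$$

$$3m\ddot{x} + 4m\ddot{y} = 2mg$$

where the right-hand sides are $\frac{\partial L}{\partial x} = -mg$ and $\frac{\partial L}{\partial y} = 2mg$. Adding the second equation to twice the first gives

$$17m\ddot{x} + 10m\ddot{y} = \frac{d}{dt}\left(17m\dot{x} + 10m\dot{y}\right) = 0$$

This is the result obtained directly using Noether’s theorem.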
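
The conservation law in this example can be cross-checked by solving the two Euler-Lagrange equations for the accelerations. The sketch below is an illustration only: it takes the Lagrangian $L = \frac{7}{2}m\dot{x}^2 + 3m\dot{x}\dot{y} + 2m\dot{y}^2 - mg(x - 2y)$, for which $\frac{\partial L}{\partial x} = -mg$ and $\frac{\partial L}{\partial y} = 2mg$, divides both equations by $m$, and applies Cramer's rule. The function name `atwood_accelerations` is a hypothetical choice.

```python
def atwood_accelerations(g=9.81):
    """Solve the two Euler-Lagrange equations of the example for (x_ddot, y_ddot).

    After dividing by m:
        7 x_ddot + 3 y_ddot = -g     (from dL/dx = -m g)
        3 x_ddot + 4 y_ddot = 2 g    (from dL/dy = +2 m g)
    """
    a11, a12, b1 = 7.0, 3.0, -g
    a21, a22, b2 = 3.0, 4.0, 2.0 * g
    det = a11 * a22 - a12 * a21          # 28 - 9 = 19
    x_ddot = (b1 * a22 - a12 * b2) / det
    y_ddot = (a11 * b2 - a21 * b1) / det
    return x_ddot, y_ddot


x_ddot, y_ddot = atwood_accelerations()
# Time derivative of the Noether momentum per unit mass, 17 x_dot + 10 y_dot.
momentum_rate = 17.0 * x_ddot + 10.0 * y_ddot
```

Here `momentum_rate` vanishes to rounding error, since the accelerations are $\ddot{x} = -\frac{10}{19}g$ and $\ddot{y} = \frac{17}{19}g$, consistent with $\dot{x} = -\frac{10}{17}\dot{y}$ for a system released from rest.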
7.4 Rotational invariance and conservation of angular momentum


The arguments, used above, apply equally well to the conjugate variables $\theta$ and $p_\theta$ for rotation about any axis.
The Lagrange equation is

$$\left\{ \frac{d}{dt} p_\theta - \frac{\partial L}{\partial \theta} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial \theta}(\mathbf{q},t) + Q_\theta^{EXC} \tag{7.14}$$

If no constraint or generalized torques act on the system, then the right-hand side of equation 7.14 is zero.
Moreover if the Lagrangian is not an explicit function of $\theta$, then $\frac{\partial L}{\partial \theta} = 0$, and assuming that the constraint
plus generalized torques are zero, then $p_\theta$ is a constant of motion.

Noether’s Theorem illustrates this general result which can be stated as, if the Lagrangian is rotationally 
invariant about some axis, then the component of the angular momentum along that axis is conserved. Also 
this is true for the more general case where the Lagrangian is invariant to rotation about any axis, which 
leads to conservation of the total angular momentum. 


7.3 Example: Conservation of angular momentum for rotational invariance: 


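The Noether theorem result for rotational invariance about an axis also can be derived using cartesian
coordinates as shown below. As discussed in appendix D, it is necessary to limit discussion of rotation to
infinitesimal rotation angles in order to represent the rotation by a vector. Consider an infinitesimal
rotation $\delta\boldsymbol{\theta}$ about some axis, which is a vector. As illustrated in the adjacent figure, this can be expressed as

$$\delta\mathbf{r} = \delta\boldsymbol{\theta} \times \mathbf{r}$$

The velocity vectors also change on rotation of the system obeying the transformation equation which is
common to all vectors, that is,

$$\delta\dot{\mathbf{r}} = \delta\boldsymbol{\theta} \times \dot{\mathbf{r}}$$

[Figure: Infinitesimal rotation, showing $\delta\boldsymbol{\theta}$, $\mathbf{r}$, $\delta\mathbf{r}$, and $\mathbf{r} + \delta\mathbf{r}$.]

If the Lagrangian is unaffected by the orientation of the system, that is, it is rotationally invariant, then
it can be shown that the angular momentum is conserved. For example, consider that the Lagrangian is
invariant to rotation about some axis $q_i$. Since the Lagrangian is a function

$$L = L(q_i, \dot{q}_i; t)$$

then the expression that the Lagrangian does not change due to an infinitesimal rotation $\delta\boldsymbol{\theta}$ about this axis
can be expressed as

$$\delta L = \sum_{i}^{3} \frac{\partial L}{\partial x_i}\delta x_i + \sum_{i}^{3} \frac{\partial L}{\partial \dot{x}_i}\delta\dot{x}_i = 0 \tag{A}$$

where cartesian coordinates have been used.
Using the generalized momentum

$$p_i = \frac{\partial L}{\partial \dot{x}_i}$$

then, Lagrange’s equation gives

$$\frac{d}{dt} p_i - \frac{\partial L}{\partial x_i} = 0$$

that is

$$\dot{p}_i = \frac{\partial L}{\partial x_i}$$

Inserting this into equation A gives

$$\delta L = \sum_{i}^{3} \dot{p}_i\, \delta x_i + \sum_{i}^{3} p_i\, \delta\dot{x}_i = 0$$

This is equivalent to the scalar products

$$\dot{\mathbf{p}} \cdot \delta\mathbf{r} + \mathbf{p} \cdot \delta\dot{\mathbf{r}} = 0$$

For an infinitesimal rotation $\delta\boldsymbol{\theta}$, then $\delta\mathbf{r} = \delta\boldsymbol{\theta} \times \mathbf{r}$ and $\delta\dot{\mathbf{r}} = \delta\boldsymbol{\theta} \times \dot{\mathbf{r}}$. Therefore

$$\dot{\mathbf{p}} \cdot (\delta\boldsymbol{\theta} \times \mathbf{r}) + \mathbf{p} \cdot (\delta\boldsymbol{\theta} \times \dot{\mathbf{r}}) = 0$$

The cyclic order can be permuted giving

$$\delta\boldsymbol{\theta} \cdot (\mathbf{r} \times \dot{\mathbf{p}}) + \delta\boldsymbol{\theta} \cdot (\dot{\mathbf{r}} \times \mathbf{p}) = 0$$

$$\delta\boldsymbol{\theta} \cdot \left[ (\mathbf{r} \times \dot{\mathbf{p}}) + (\dot{\mathbf{r}} \times \mathbf{p}) \right] = 0$$

$$\delta\boldsymbol{\theta} \cdot \frac{d}{dt}(\mathbf{r} \times \mathbf{p}) = 0$$

Because the infinitesimal angle $\delta\boldsymbol{\theta}$ is arbitrary, then the time derivative

$$\frac{d}{dt}(\mathbf{r} \times \mathbf{p}) = 0$$

about the axis of rotation $\delta\boldsymbol{\theta}$. But the bracket $(\mathbf{r} \times \mathbf{p})$ equals the angular momentum. That is,

Angular momentum $= (\mathbf{r} \times \mathbf{p}) =$ constant

This proves Noether’s theorem that the angular momentum about any axis is conserved if the Lagrangian
is rotationally invariant about that axis.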
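
The result can be illustrated numerically. The sketch below is a minimal example, assuming a particle of mass $m$ moving in the $x$-$y$ plane in the harmonic potential $U = \frac{1}{2}(k_x x^2 + k_y y^2)$, integrated with the velocity Verlet scheme. The potential, the initial conditions, and the function name `angular_momentum_drift` are assumptions made for illustration. When $k_x = k_y$ the Lagrangian is rotationally invariant about the $z$ axis and $L_z = m(x\dot{y} - y\dot{x})$ stays constant; when $k_x \neq k_y$ the symmetry is broken and $L_z$ changes.

```python
def angular_momentum_drift(kx, ky, m=1.0, dt=1e-3, steps=2000):
    """Integrate m r'' = -(kx*x, ky*y) with velocity Verlet.

    Returns (L_z at the start, L_z at the end) for a particle that starts
    at (1, 0) with velocity (0, 1). All values are illustrative assumptions.
    """
    x, y = 1.0, 0.0
    vx, vy = 0.0, 1.0
    lz_start = m * (x * vy - y * vx)
    ax, ay = -kx * x / m, -ky * y / m
    for _ in range(steps):
        # Half kick, drift, recompute acceleration, half kick.
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = -kx * x / m, -ky * y / m
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    lz_end = m * (x * vy - y * vx)
    return lz_start, lz_end


symmetric = angular_momentum_drift(1.0, 1.0)    # rotationally invariant case
asymmetric = angular_momentum_drift(1.0, 4.0)   # symmetry broken
```

Velocity Verlet is used here because, for a central force, each kick is parallel to $\mathbf{r}$ and each drift is parallel to $\mathbf{v}$, so the scheme preserves $\mathbf{r} \times \mathbf{p}$ up to rounding error and the comparison is not obscured by integration error.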


7.4 Example: Diatomic molecules and axially-symmetric nuclei 


An interesting example of Noether’s theorem applies to diatomic molecules such as H$_2$, N$_2$, F$_2$, O$_2$, Cl$_2$
and Br$_2$. The electric field produced by the two charged nuclei of the diatomic molecule has cylindrical
symmetry about the axis through the two nuclei. Electrons are bound to this dumbbell arrangement of the two
nuclear charges which may be rotating and vibrating in free space. Assuming that there are no external torques
acting on the diatomic molecule in free space, then the angular momentum about any fixed axis in free space
must be conserved according to Noether’s theorem. If no external torques are applied, then the component of
the angular momentum about any fixed axis is conserved, that is, the total angular momentum is conserved.
What is especially interesting is that since the electrostatic potential, and thus the Lagrangian, of the diatomic
molecule has cylindrical symmetry, that is $\frac{\partial L}{\partial \phi} = 0$, then the component of the angular momentum with respect
to this symmetry axis also is conserved irrespective of how the diatomic molecule rotates or vibrates in free
space. That is, an additional symmetry has been identified that leads to an additional conservation law that
applies to the angular momentum.

An example of Noether’s theorem is in nuclear physics where some nuclei have a spheroidal shape similar
to an American football or a rugby ball. This spheroidal shape has an axis of symmetry along the long axis.
The Lagrangian is rotationally invariant about the symmetry axis resulting in the angular momentum about
the symmetry axis being conserved in addition to conservation of the total angular momentum.


7.5 Cyclic coordinates 


Translational and rotational invariance occurs when a system has a cyclic coordinate $q_k$. A cyclic coordinate
is one that does not explicitly appear in the Lagrangian. The term cyclic is a natural name when one has
cylindrical or spherical symmetry. In Hamiltonian mechanics a cyclic coordinate often is called an ignorable
coordinate. By virtue of Lagrange’s equations

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_k} - \frac{\partial L}{\partial q_k} = 0 \tag{7.15}$$

a cyclic coordinate $q_k$ is one for which $\partial L/\partial q_k = 0$. Thus

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_k} = \dot{p}_k = 0 \tag{7.16}$$

that is, $p_k$ is a constant of motion if the conjugate coordinate $q_k$ is cyclic. This is just Noether’s Theorem.


7.6 Kinetic energy in generalized coordinates
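Application of Noether’s theorem to the conservation of energy requires the kinetic energy to be expressed
in generalized coordinates. In terms of fixed rectangular coordinates, the kinetic energy for N bodies, each
having three degrees of freedom, is expressed as

$$T = \frac{1}{2}\sum_{\alpha=1}^{N}\sum_{i=1}^{3} m_\alpha\,\dot{x}_{\alpha,i}^2 \tag{7.17}$$

These can be expressed in terms of generalized coordinates as $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ and in terms of generalized
velocities

$$\dot{x}_{\alpha,i} = \sum_{j}\frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{7.18}$$

Taking the square of $\dot{x}_{\alpha,i}$ and inserting into the kinetic energy relation gives

$$T(q,\dot{q},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha\sum_{i}\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$

This can be abbreviated as

$$T(q,\dot{q},t) = T_2(q,\dot{q},t) + T_1(q,\dot{q},t) + T_0(q,t) \tag{7.20}$$

where

$$T_2(q,\dot{q},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k = \sum_{j,k}a_{jk}\dot{q}_j\dot{q}_k \tag{7.21}$$

$$T_1(q,\dot{q},t) = \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j = \sum_{j}b_j\dot{q}_j \tag{7.22}$$

$$T_0(q,t) = \sum_\alpha\sum_{i}\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.23}$$

where

$$a_{jk} \equiv \sum_\alpha\sum_{i}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{7.24}$$

When the transformed system is scleronomic, time does not appear explicitly in the transformation
equations to generalized coordinates since $\partial x_{\alpha,i}/\partial t = 0$. Then $T_1 = T_0 = 0$, and the kinetic energy reduces to
a homogeneous quadratic function of the generalized velocities

$$T(q,\dot{q},t) = T_2(q,\dot{q},t) \tag{7.25}$$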
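As a small numerical illustration of the decomposition in equation 7.20, the sketch below treats a single particle in one dimension with an assumed rheonomic transformation x = q + v0 t, so that ∂x/∂q = 1 and ∂x/∂t = v0. The helper name `kinetic_terms` and the numerical values are hypothetical.

```python
def kinetic_terms(m, dx_dq, dx_dt, qdot):
    """Split T = (m/2) xdot^2, with xdot = (dx/dq) qdot + dx/dt, into T2, T1, T0."""
    t2 = 0.5 * m * dx_dq ** 2 * qdot ** 2   # quadratic in qdot
    t1 = m * dx_dq * dx_dt * qdot           # linear in qdot
    t0 = 0.5 * m * dx_dt ** 2               # independent of qdot
    return t2, t1, t0

m, v0, qdot = 2.0, 3.0, 0.5

# Rheonomic case: x = q + v0*t, so dx/dq = 1 and dx/dt = v0.
t2, t1, t0 = kinetic_terms(m, 1.0, v0, qdot)
total = 0.5 * m * (qdot + v0) ** 2
print("T2, T1, T0 =", t2, t1, t0, " sum =", t2 + t1 + t0, " direct T =", total)

# Scleronomic case: dx/dt = 0, so only T2 survives (equation 7.25).
print("scleronomic:", kinetic_terms(m, 1.0, 0.0, qdot))
```

With these values the three terms sum to the kinetic energy computed directly from the velocity, and setting ∂x/∂t = 0 leaves only the quadratic term.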
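A useful relation can be derived by taking the derivative of equation 7.21 with respect to $\dot{q}_l$. That is

$$\frac{\partial T_2(q,\dot{q},t)}{\partial\dot{q}_l} = \sum_{k}a_{lk}\dot{q}_k + \sum_{j}a_{jl}\dot{q}_j \tag{7.26}$$

Multiplying this by $\dot{q}_l$ and summing over $l$ gives

$$\sum_l\dot{q}_l\frac{\partial T_2(q,\dot{q},t)}{\partial\dot{q}_l} = \sum_{k,l}a_{lk}\dot{q}_k\dot{q}_l + \sum_{j,l}a_{jl}\dot{q}_j\dot{q}_l = 2\sum_{j,k}a_{jk}\dot{q}_j\dot{q}_k = 2T_2$$

Similarly, the products of the generalized velocities $\dot{q}_l$ with the corresponding derivatives of $T_1$ and $T_0$ give

$$\sum_l\dot{q}_l\frac{\partial T_2}{\partial\dot{q}_l} = 2T_2 \tag{7.27}$$

$$\sum_l\dot{q}_l\frac{\partial T_1}{\partial\dot{q}_l} = T_1(q,\dot{q},t) \tag{7.28}$$

$$\sum_l\dot{q}_l\frac{\partial T_0}{\partial\dot{q}_l} = 0 \tag{7.29}$$

Equation 7.25 gives that $T = T_2$ when the transformed system is scleronomic, i.e. $\partial x_{\alpha,i}/\partial t = 0$, and then the
kinetic energy is a quadratic function of the generalized velocities $\dot{q}_j$. Using the definition of the generalized
momentum, equation 7.3, assuming $T = T_2$, and that the potential U is velocity independent, gives that

$$p_l \equiv \frac{\partial L}{\partial\dot{q}_l} = \frac{\partial T}{\partial\dot{q}_l} - \frac{\partial U}{\partial\dot{q}_l} = \frac{\partial T_2}{\partial\dot{q}_l} \tag{7.30}$$

Then equation 7.27 reduces to the useful relation that

$$T_2 = \frac{1}{2}\sum_l\dot{q}_l\,p_l = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$

where, for compactness, the summation is abbreviated as a scalar product.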
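Relations 7.27 to 7.29 are instances of Euler’s theorem for homogeneous functions, and they can be spot-checked numerically. The sketch below is only an illustration: the coefficients `a`, `b`, the constant used for T0, and the velocities are arbitrary assumed values, and the derivatives are taken by central differences rather than analytically.

```python
def t2(a, qdot):
    """Homogeneous quadratic form T2 = sum_jk a[j][k] qdot_j qdot_k."""
    n = len(qdot)
    return sum(a[j][k] * qdot[j] * qdot[k] for j in range(n) for k in range(n))

def euler_sum(f, qdot, eps=1e-6):
    """Evaluate sum_l qdot_l * dF/dqdot_l using central differences."""
    total = 0.0
    for l in range(len(qdot)):
        up = list(qdot)
        dn = list(qdot)
        up[l] += eps
        dn[l] -= eps
        total += qdot[l] * (f(up) - f(dn)) / (2 * eps)
    return total

a = [[1.5, 0.2], [0.2, 0.7]]   # hypothetical coefficients a_jk
b = [0.4, -1.1]                # hypothetical coefficients b_j
qdot = [0.8, -0.3]

def T2(v):
    return t2(a, v)

def T1(v):
    return sum(bj * vj for bj, vj in zip(b, v))

def T0(v):
    return 2.5                 # no dependence on the velocities

print("sum qdot dT2/dqdot =", euler_sum(T2, qdot), " 2*T2 =", 2 * T2(qdot))
print("sum qdot dT1/dqdot =", euler_sum(T1, qdot), " T1   =", T1(qdot))
print("sum qdot dT0/dqdot =", euler_sum(T0, qdot))
```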


7.7 Generalized energy and the Hamiltonian function 


Consider the time derivative of the Lagrangian, plus the fact that time is the independent variable in the
Lagrangian. Then the total time derivative is

$$\frac{dL}{dt} = \sum_j\frac{\partial L}{\partial q_j}\dot{q}_j + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t} \tag{7.32}$$

The Lagrange equations for a conservative force are given by equation 6.60 to be

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t) \tag{7.33}$$

The holonomic constraints can be accounted for using the Lagrange multiplier terms while the generalized
force $Q_j^{EXC}$ includes non-holonomic forces or other forces not included in the potential energy term of the
Lagrangian, or holonomic forces not accounted for by the Lagrange multiplier terms.

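Substituting equation 7.33 into equation 7.32 gives

$$\begin{aligned}\frac{dL}{dt} &= \sum_j\dot{q}_j\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t} \\ &= \sum_j\frac{d}{dt}\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] + \frac{\partial L}{\partial t}\end{aligned} \tag{7.34}$$

This can be written in the form

$$\frac{d}{dt}\left[\sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L\right] = \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] - \frac{\partial L}{\partial t} \tag{7.35}$$

Define Jacobi’s Generalized Energy¹ $h(q,\dot{q},t)$ by

$$h(q,\dot{q},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L(q,\dot{q},t) \tag{7.36}$$

Jacobi’s generalized momentum, equation 7.3, can be used to express the generalized energy $h(q,\dot{q},t)$ in
terms of the canonical coordinates $q_j$ and $p_j$, plus time t. Define the Hamiltonian function to equal the
generalized energy expressed in terms of the conjugate variables $(q_j, p_j)$, that is,

$$H(q,p,t) \equiv h(q,\dot{q},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L(q,\dot{q},t) = \sum_j\left(\dot{q}_j p_j\right) - L(q,\dot{q},t) \tag{7.37}$$

This Hamiltonian $H(q,p,t)$ underlies Hamiltonian mechanics which plays a profoundly important role in
most branches of physics as illustrated in chapters 8, 15 and 18.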
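The definition in equation 7.36 can be evaluated for any Lagrangian supplied as a function. The following is a rough sketch under stated assumptions: `generalized_energy` is a hypothetical helper that estimates ∂L/∂q̇_j by central differences, and the plane-pendulum Lagrangian with the chosen m, l, g is used only as a test case.

```python
import math

def generalized_energy(lagrangian, q, qdot, t, eps=1e-6):
    """h = sum_j qdot_j dL/dqdot_j - L, with dL/dqdot_j from central differences."""
    h = -lagrangian(q, qdot, t)
    for j in range(len(qdot)):
        up = list(qdot)
        dn = list(qdot)
        up[j] += eps
        dn[j] -= eps
        p_j = (lagrangian(q, up, t) - lagrangian(q, dn, t)) / (2 * eps)
        h += qdot[j] * p_j
    return h

def pendulum_lagrangian(q, qdot, t, m=1.0, l=2.0, g=9.81):
    """L = (1/2) m l^2 thetadot^2 + m g l cos(theta); assumed test case."""
    return 0.5 * m * l ** 2 * qdot[0] ** 2 + m * g * l * math.cos(q[0])

theta, theta_dot = 0.4, 1.3
h = generalized_energy(pendulum_lagrangian, [theta], [theta_dot], 0.0)
energy = 0.5 * 1.0 * 2.0 ** 2 * theta_dot ** 2 - 1.0 * 9.81 * 2.0 * math.cos(theta)
print("h =", h, " T + U =", energy)
```

For this scleronomic, velocity-independent potential the numerical h should agree with T + U to within the finite-difference error.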


¹ Most textbooks call the function $h(q,\dot{q},t)$ Jacobi’s energy integral. This book adopts the more descriptive name Generalized
energy in analogy with use of generalized coordinates q and generalized momentum p.


7.8 Generalized energy theorem


The Hamiltonian function, equation 7.37, plus equation 7.35 lead to the generalized energy theorem

$$\frac{dH(q,p,t)}{dt} = \frac{dh(q,\dot{q},t)}{dt} = \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] - \frac{\partial L(q,\dot{q},t)}{\partial t} \tag{7.38}$$

Note that for the special case where all the external forces $\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] = 0$, then

$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.39}$$

Thus the Hamiltonian is time independent if both $\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] = 0$ and the Lagrangian is
not an explicit function of time. For an isolated closed system having no external forces acting, the Lagrangian is
not an explicit function of time because there is no external, time-dependent potential energy. That is, the
Lagrangian is time-independent, and

$$\frac{d}{dt}\left[\sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L\right] = \frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0 \tag{7.40}$$

As a consequence, the Hamiltonian $H(q,p,t)$, and generalized energy $h(q,\dot{q},t)$, both are constants of motion
if the Lagrangian is not an explicit function of time, and if the external non-potential forces are zero. This is an example
of Noether’s theorem, where the symmetry of time independence leads to conservation of the conjugate
variable, which is the Hamiltonian or Generalized energy.


7.9 Generalized energy and total energy 
The generalized kinetic energy, equation 7.20, can be used to write the generalized Lagrangian as

$$L(q,\dot{q},t) = T_2(q,\dot{q},t) + T_1(q,\dot{q},t) + T_0(q,t) - U(q,t) \tag{7.41}$$

If the potential energy U does not depend explicitly on velocities $\dot{q}_j$ or time, then

$$p_j = \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial(T-U)}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} \tag{7.42}$$

Equation 7.42 can be used to write the Hamiltonian, equation 7.37, as

$$H(q,p,t) = \sum_j\left(\dot{q}_j\frac{\partial T_2}{\partial\dot{q}_j}\right) + \sum_j\left(\dot{q}_j\frac{\partial T_1}{\partial\dot{q}_j}\right) + \sum_j\left(\dot{q}_j\frac{\partial T_0}{\partial\dot{q}_j}\right) - L(q,\dot{q},t) \tag{7.43}$$

Using equations 7.27, 7.28, 7.29 gives that the total generalized Hamiltonian $H(q,p,t)$ equals

$$H(q,p,t) = 2T_2 + T_1 - (T_2 + T_1 + T_0 - U) = T_2 - T_0 + U \tag{7.44}$$

But the sum of the kinetic and potential energies equals the total energy. Thus equation 7.44 can be rewritten
in the form

$$H(q,p,t) = (T + U) - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{7.45}$$
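Equations 7.44 and 7.45 can be verified with a few lines of arithmetic for a single degree of freedom. This sketch assumes a Lagrangian of the form L = a q̇² + b q̇ + c − U, so that T2 = a q̇², T1 = b q̇ and T0 = c; the coefficients and the helper name `hamiltonian_direct` are illustrative assumptions only.

```python
def hamiltonian_direct(a, b, c, u, qdot):
    """h = qdot * dL/dqdot - L for L = a*qdot**2 + b*qdot + c - u."""
    p = 2 * a * qdot + b                      # p = dL/dqdot
    lagrangian = a * qdot ** 2 + b * qdot + c - u
    return qdot * p - lagrangian

a, b, c, u, qdot = 1.5, 0.4, 2.5, 0.9, -0.7   # assumed values
t2, t1, t0 = a * qdot ** 2, b * qdot, c
energy = t2 + t1 + t0 + u                     # E = T + U
h = hamiltonian_direct(a, b, c, u, qdot)

print("h from definition   :", h)
print("T2 - T0 + U  (7.44) :", t2 - t0 + u)
print("E - (T1+2T0) (7.45) :", energy - (t1 + 2 * t0))
```

All three printed values should coincide, and they differ from E whenever T1 or T0 is nonzero.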


Note that, in general, Jacobi’s generalized energy and the Hamiltonian do not equal the total energy E. However, in
the special case where the transformation is scleronomic, then $T_1 = T_0 = 0$, and if the potential energy U
does not depend explicitly on $\dot{q}_j$, then the generalized energy (Hamiltonian) equals the total energy, that is,
H = E. Recognition of the relation between the Hamiltonian and the total energy facilitates determining
the equations of motion.


7.10 Hamiltonian invariance


Sections 7.8 and 7.9 addressed two important and independent features of the Hamiltonian regarding: a) when
H is conserved, and b) when H equals the total mechanical energy. These important results are summarized
below with a discussion of the assumptions made in deriving the Hamiltonian, as well as the implications.


a) Conservation of generalized energy 


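The generalized energy theorem (7.38) was given as

$$\frac{dH(q,p,t)}{dt} = \frac{dh(q,\dot{q},t)}{dt} = \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] - \frac{\partial L(q,\dot{q},t)}{\partial t} \tag{7.46}$$

Note that when $\sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] = 0$, then equation 7.46 reduces to

$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.47}$$

Also, when $\sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] = 0$, and if the Lagrangian is not an explicit function of time,
then the Hamiltonian is a constant of motion. That is, H is conserved if, and only if, the Lagrangian, and
consequently the Hamiltonian, are not explicit functions of time, and if the external forces are zero.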


b) The generalized energy and total energy 


If the following two requirements are satisfied

1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities, that is, the
transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.

2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial\dot{q}_j = 0$.


Then equation 7.45 implies that the Hamiltonian equals the total mechanical energy, that is, 


H=T+U=E (7.48) 


Expressed in words, the generalized energy (Hamiltonian) equals the total energy if the constraints are 
time independent and the potential energy is velocity independent. This is equivalent to stating that, if the 
constraints, or generalized coordinates, for the system are time independent, then H = E. 

The four combinations of the above two independent conditions, assuming that the external forces term 
in equation 7.46 is zero, are summarized in table 7.1. 


Table 7.1: Hamiltonian and total energy

Hamiltonian      | Constraints and coordinate transformation
time behavior    | Time independent        | Time dependent
dH/dt = 0        | H conserved, H = E      | H conserved, H ≠ E
dH/dt ≠ 0        | H not conserved, H = E  | H not conserved, H ≠ E


Note the following general facts regarding the Lagrangian and the Hamiltonian. 

(1) the Lagrangian is indefinite with respect to addition of a constant to the scalar potential, 

(2) the Lagrangian is indefinite with respect to addition of a constant velocity, 

(3) there is no unique choice of generalized coordinates. 

(4) the Hamiltonian is a scalar function that is derived from the Lagrangian scalar function. 

(5) the generalized momentum is derived from the Lagrangian. 

These facts, plus the ability to recognize the conditions under which H is conserved, and when H = E,
can greatly facilitate solving problems as shown by the following two examples.


7.5 Example: Linear harmonic oscillator on a cart moving at constant velocity


Consider a linear harmonic oscillator located on a cart that
is moving with constant velocity $v_0$ in the x direction, as shown
in the adjacent figure. Let the laboratory frame be the unprimed
frame, and the cart frame be designated the primed frame. Assume
that $x = x'$ at $t = 0$. Then

$$x' = x - v_0 t \qquad \dot{x}' = \dot{x} - v_0 \qquad \ddot{x}' = \ddot{x}$$

[Figure: Harmonic oscillator on cart moving at uniform velocity $v_0$.]

The harmonic oscillator will have a potential energy of

$$U = \frac{1}{2}kx'^2 = \frac{1}{2}k(x - v_0 t)^2$$

Laboratory frame: The Lagrangian is

$$L(x,\dot{x},t) = \frac{m\dot{x}^2}{2} - \frac{1}{2}k(x - v_0 t)^2$$

Lagrange equation $\Lambda_x L = 0$ gives the equation of motion to be

$$m\ddot{x} = -k(x - v_0 t)$$

The definition of generalized momentum gives

$$p = \frac{\partial L}{\partial\dot{x}} = m\dot{x}$$

The Hamiltonian is

$$H(x,p,t) = \dot{x}\frac{\partial L}{\partial\dot{x}} - L = \frac{p^2}{2m} + \frac{1}{2}k(x - v_0 t)^2$$

The Hamiltonian is the sum of the kinetic and potential energies and equals the total energy of the system,
but it is not conserved since L and H are both explicit functions of time, that is $\frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \neq 0$.
Physically this is understood in that energy must flow into and out of the external constraint keeping the cart 
moving uniformly at a constant velocity vo against the reaction to the oscillating mass. That is, assuming 
a uniform velocity for the moving cart constitutes a time-dependent constraint on the mass, and the force of 
constraint does work in actual displacement of the complete system. If the constraint did not exist, then the 
cart momentum would oscillate such that the total momentum of cart plus spring system is conserved. 
Cart frame: Transform the Lagrangian to the primed coordinates in the moving frame of reference,
which also is an inertial frame. Then the Lagrangian L, in terms of the moving cart frame coordinates, is

$$L(x',\dot{x}',t) = \frac{m}{2}\left(\dot{x}'^2 + 2\dot{x}'v_0 + v_0^2\right) - \frac{1}{2}kx'^2$$

The Lagrange equation of motion $\Lambda_{x'} L = 0$ gives the equation of motion to be

$$m\ddot{x}' = -kx'$$

where $x'$ is the displacement of the mass with respect to the cart. This implies that an observer on the
cart will observe simple harmonic motion as is to be expected from the principle of equivalence in Galilean
relativity.

The definition of the generalized momentum gives the linear momentum in the primed frame coordinates
to be

$$p' = \frac{\partial L}{\partial\dot{x}'} = m\dot{x}' + mv_0$$

The cart-frame Hamiltonian also can be expressed in terms of the coordinates in the moving frame to be

$$H(x',p',t) = \dot{x}'\frac{\partial L}{\partial\dot{x}'} - L = \frac{(p' - mv_0)^2}{2m} + \frac{1}{2}kx'^2 - \frac{m}{2}v_0^2$$

Note that the Lagrangian and Hamiltonian expressed in terms of the coordinates in the cart frame of reference
are not explicitly time dependent, therefore H is conserved. However, the cart-frame Hamiltonian does not
equal the total energy since the coordinate transformation is time dependent. Actually the first two terms in
the above Hamiltonian are the energy of the harmonic oscillator in the cart frame. This example shows that
the Hamiltonians differ when expressed in terms of either the laboratory or cart frames of reference.
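The contrast between the two frames can be made concrete by evaluating both Hamiltonians along a known solution. The sketch below assumes the cart-frame solution x' = A cos(Ωt) with Ω = √(k/m), which satisfies m ẍ' = −k x'; the numerical values of m, k, v0 and A are arbitrary illustrative choices.

```python
import math

m, k, v0, amp = 1.0, 4.0, 0.5, 0.2   # assumed parameters
omega = math.sqrt(k / m)

def state(t):
    """Cart-frame solution x' = amp*cos(omega t) and its time derivative."""
    xp = amp * math.cos(omega * t)
    xp_dot = -amp * omega * math.sin(omega * t)
    return xp, xp_dot

def h_lab(t):
    """Laboratory-frame Hamiltonian p^2/2m + k (x - v0 t)^2 / 2, using x - v0 t = x'."""
    xp, xp_dot = state(t)
    p = m * (xp_dot + v0)            # p = m * xdot with xdot = xdot' + v0
    return p ** 2 / (2 * m) + 0.5 * k * xp ** 2

def h_cart(t):
    """Cart-frame Hamiltonian (p' - m v0)^2/2m + k x'^2/2 - m v0^2/2."""
    xp, xp_dot = state(t)
    p_prime = m * (xp_dot + v0)
    return (p_prime - m * v0) ** 2 / (2 * m) + 0.5 * k * xp ** 2 - 0.5 * m * v0 ** 2

times = [0.05 * n for n in range(200)]
lab = [h_lab(t) for t in times]
cart = [h_cart(t) for t in times]
print("lab-frame H range :", min(lab), "to", max(lab))
print("cart-frame H range:", min(cart), "to", max(cart))
```

The laboratory-frame value oscillates in time while the cart-frame value stays fixed, in line with the discussion above.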


7.6 Example: Isotropic central force in a rotating frame


Consider a mass subject to a central isotropic radial
force with potential U(r) as shown in the adjacent figure. Compare
the Hamiltonian H in the fixed frame of reference S
with the Hamiltonian H' in a frame of reference S' that
is rotating about the center of the force with constant
angular velocity ω. Restrict this case to rotation about
one axis so that only two polar coordinates r and φ need
to be considered. The transformations are

$$r' = r \qquad \phi' = \phi - \omega t$$

Also

$$U(r) = U(r')$$

[Figure: Mass subject to radial force.]

Fixed frame of reference S:

$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\phi}^2\right) - U(r)$$

Since the Lagrangian is not explicitly time dependent, then the Hamiltonian is conserved. For this fixed-frame
Hamiltonian the generalized momenta are

$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = mr^2\dot{\phi}$$
$$p_r = \frac{\partial L}{\partial\dot{r}} = m\dot{r}$$

The Hamiltonian equals

$$H = p_r\dot{r} + p_\phi\dot{\phi} - L = \frac{1}{2m}\left(p_r^2 + \frac{p_\phi^2}{r^2}\right) + U(r)$$

The Hamiltonian in the fixed frame is conserved and equals the total energy, that is H = T + U.

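Rotating frame of reference S’

The above inertial fixed-frame Lagrangian can be written in terms of the primed (non-inertial rotating
frame) coordinates as

$$L = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\phi}^2\right) - U(r) = \frac{m}{2}\left(\dot{r}'^2 + r'^2(\dot{\phi}' + \omega)^2\right) - U(r')$$

The generalized momenta derived from this Lagrangian are

$$p_{\phi'} = \frac{\partial L}{\partial\dot{\phi}'} = mr'^2(\dot{\phi}' + \omega) = mr'^2\dot{\phi}' + mr'^2\omega$$
$$p_{r'} = \frac{\partial L}{\partial\dot{r}'} = m\dot{r}' = p_r$$

The Hamiltonian expressed in terms of the non-inertial rotating frame coordinates is

$$H'(r',\phi',p_{r'},p_{\phi'}) = p_{r'}\dot{r}' + p_{\phi'}\dot{\phi}' - L = \frac{1}{2m}\left(p_{r'}^2 + \frac{p_{\phi'}^2}{r'^2}\right) - \omega p_{\phi'} + U(r')$$

Note that $H'(r',\phi',p_{r'},p_{\phi'})$ is time independent and therefore is conserved, but $H'(r',\phi',p_{r'},p_{\phi'}) \neq E$ because
the generalized coordinates are time dependent. In addition, $p_{\phi'}$ is conserved since

$$\dot{p}_{\phi'} = -\frac{\partial H'}{\partial\phi'} = 0$$


7.7 Example: The plane pendulum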


The simple plane pendulum in a uniform gravitational
field g is an example that illustrates Hamiltonian
invariance. There is only one generalized coordinate, θ,
and the Lagrangian for this system is

$$L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$

The momentum conjugate to θ is

$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = ml^2\dot{\theta}$$

which is the angular momentum about the pivot point.
Using the Lagrange-Euler equation this gives that

$$\frac{dp_\theta}{dt} = \dot{p}_\theta = \frac{\partial L}{\partial\theta} = -mgl\sin\theta$$

[Figure: The plane pendulum constrained to oscillate in a vertical plane in a uniform gravitational field.]

Note that the angular momentum $p_\theta$ is not a constant of motion since its time derivative depends explicitly on θ.
The Hamiltonian is

$$H = \sum_i p_i\dot{q}_i - L = p_\theta\dot{\theta} - L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\cos\theta = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta$$

Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore the Hamiltonian is conserved.
Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian
equals the total energy E, which is a constant of motion.

$$H = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E$$
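A short simulation illustrates both statements: H stays essentially constant while $p_\theta$ does not. This is a minimal sketch that advances θ and $p_\theta$ with the relations $p_\theta = ml^2\dot{\theta}$ and $\dot{p}_\theta = -mgl\sin\theta$ given above, using a leapfrog step; the step size, initial angle, and the values of m, l, g are assumptions made for illustration.

```python
import math

m, l, g = 1.0, 1.0, 9.81   # assumed parameters

def hamiltonian(theta, p):
    """H = p_theta^2 / (2 m l^2) - m g l cos(theta)."""
    return p ** 2 / (2 * m * l ** 2) - m * g * l * math.cos(theta)

def step(theta, p, dt):
    """One leapfrog step using theta_dot = p/(m l^2) and p_dot = -m g l sin(theta)."""
    p -= 0.5 * dt * m * g * l * math.sin(theta)
    theta += dt * p / (m * l ** 2)
    p -= 0.5 * dt * m * g * l * math.sin(theta)
    return theta, p

theta, p = 1.0, 0.0        # released from rest at 1 radian
dt = 1e-3
h_values, p_values = [], []
for _ in range(5000):
    h_values.append(hamiltonian(theta, p))
    p_values.append(p)
    theta, p = step(theta, p, dt)

h_spread = max(h_values) - min(h_values)
p_spread = max(p_values) - min(p_values)
print(f"spread in H       = {h_spread:.2e}")
print(f"spread in p_theta = {p_spread:.3f}")
```

The small residual spread in H comes from the finite step size of the integrator, not from the physics.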


7.8 Example: Oscillating cylinder in a cylindrical bowl 


It is important to correctly account for constraint forces when using
Noether’s theorem for constrained systems. Noether’s theorem assumes
the variables are independent. This is illustrated by considering
the example of a solid cylinder rolling in a fixed cylindrical bowl. Assume
that a uniform cylinder of radius ρ and mass m is constrained
to roll without slipping on the inner surface of the lower half of a hollow
cylinder of radius R. The motion is constrained to ensure that
the axes of both cylinders remain parallel and ρ < R.

The generalized coordinates are taken to be the angles θ and φ
which are measured with respect to a fixed vertical axis. Then the
kinetic energy and potential energy are

$$T = \frac{1}{2}m\left[(R-\rho)\dot{\theta}\right]^2 + \frac{1}{2}I\dot{\phi}^2 \qquad U = \left[R - (R-\rho)\cos\theta\right]mg$$

where m is the mass of the small cylinder and where U = 0 at the lowest position of the cylinder. The moment
of inertia of a uniform cylinder is $I = \frac{1}{2}m\rho^2$.
The Lagrangian is

$$L = T - U = \frac{1}{2}m\left[(R-\rho)\dot{\theta}\right]^2 + \frac{1}{4}m\rho^2\dot{\phi}^2 - \left[R - (R-\rho)\cos\theta\right]mg$$

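Since the solid cylinder rotates without slipping inside the cylindrical shell, then the equation of constraint is

$$g(\theta,\phi) = R\theta - \rho(\phi + \theta) = 0$$

Using the Lagrangian, plus the one equation of constraint, requires one Lagrange multiplier. Then the
Lagrange equations of motion for θ and φ are

$$\frac{\partial L}{\partial\theta} - \frac{d}{dt}\left[\frac{\partial L}{\partial\dot{\theta}}\right] + \lambda\frac{\partial g}{\partial\theta} = 0$$
$$\frac{\partial L}{\partial\phi} - \frac{d}{dt}\left[\frac{\partial L}{\partial\dot{\phi}}\right] + \lambda\frac{\partial g}{\partial\phi} = 0$$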
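Substituting the Lagrangian and the equation of constraint gives two equations of motion

$$-(R-\rho)mg\sin\theta - m(R-\rho)^2\ddot{\theta} + \lambda(R-\rho) = 0$$
$$-\frac{1}{2}m\rho^2\ddot{\phi} - \lambda\rho = 0$$

The lower equation of motion gives that

$$\lambda = -\frac{1}{2}m\rho\ddot{\phi}$$

Substituting this into the equation of constraint gives

$$\lambda = -\frac{1}{2}m(R-\rho)\ddot{\theta}$$

Substituting this into the first equation of motion gives the equation of motion for θ to be

$$\ddot{\theta} = -\frac{2g}{3(R-\rho)}\sin\theta$$

that is

$$\lambda = \frac{mg}{3}\sin\theta$$

The torque acting on the small cylinder due to the frictional force is

$$F\rho = \frac{1}{2}m\rho^2\ddot{\phi} = -\lambda\rho$$

Thus the frictional force is

$$F = -\lambda = -\frac{mg}{3}\sin\theta$$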


Noether’s theorem can be used to ascertain if the angular momentum $p_\theta$ is a constant of motion. The
derivative of the Lagrangian

$$\frac{\partial L}{\partial\theta} = -(R-\rho)mg\sin\theta$$

is nonzero, and thus the Lagrange equations tell us that $\dot{p}_\theta \neq 0$. Therefore $p_\theta$ is not a constant of motion.

The Lagrangian is not an explicit function of φ, which would suggest that $p_\phi$ is a constant of motion.
But this is incorrect because the constraint equation $\phi = \frac{(R-\rho)}{\rho}\theta$ couples θ and φ, that is, they are not
independent variables, and thus $p_\theta$ and $p_\phi$ are coupled by the constraint equation. As a result $p_\phi$ is not a
constant of motion because it is directly coupled to $p_\theta$, which is not a constant of motion.
Thus neither $p_\theta$ nor $p_\phi$ are constants of motion. This illustrates that one must account carefully for equations
of constraint, and the concomitant constraint forces, when applying Noether’s theorem which tacitly assumes
independent variables.
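The role of the constraint force can be made explicit by evaluating the rates of change of both momenta with the Lagrange multiplier term included. The sketch below is an illustration under assumptions: it uses the multiplier λ = (mg/3) sin θ obtained from the equations of motion above, together with ∂g/∂θ = R − ρ and ∂g/∂φ = −ρ from the constraint; the function names and numerical values are hypothetical.

```python
import math

def multiplier(m, g, theta):
    """Lagrange multiplier lambda = (m g / 3) sin(theta)."""
    return m * g * math.sin(theta) / 3.0

def p_phi_dot(m, g, rho, theta):
    """d(p_phi)/dt = dL/dphi + lambda * dg/dphi, with dL/dphi = 0 and dg/dphi = -rho."""
    return 0.0 + multiplier(m, g, theta) * (-rho)

def p_theta_dot(m, g, big_r, rho, theta):
    """d(p_theta)/dt = dL/dtheta + lambda * dg/dtheta, with dg/dtheta = R - rho."""
    dl_dtheta = -(big_r - rho) * m * g * math.sin(theta)
    return dl_dtheta + multiplier(m, g, theta) * (big_r - rho)

m, g, big_r, rho, theta = 1.0, 9.81, 1.0, 0.25, 0.3   # assumed values
rate_phi = p_phi_dot(m, g, rho, theta)
rate_theta = p_theta_dot(m, g, big_r, rho, theta)
print("dp_phi/dt   =", rate_phi)
print("dp_theta/dt =", rate_theta)
# The rolling constraint ties the two rates together: p_phi = rho * p_theta / (2 (R - rho)).
print("rho * dp_theta/dt / (2 (R - rho)) =", rho * rate_theta / (2 * (big_r - rho)))
```

Even though ∂L/∂φ = 0, the multiplier term makes the rate of change of $p_\phi$ nonzero whenever θ ≠ 0.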

The Hamiltonian can be derived using the generalized momenta

$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = m(R-\rho)^2\dot{\theta}$$
$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = \frac{1}{2}m\rho^2\dot{\phi}$$

Then the Hamiltonian is given by

$$H = p_\theta\dot{\theta} + p_\phi\dot{\phi} - L = \frac{p_\theta^2}{2m(R-\rho)^2} + \frac{p_\phi^2}{m\rho^2} + \left[R - (R-\rho)\cos\theta\right]mg$$

Note that the transformation to generalized coordinates is time independent and the potential is not velocity
dependent, thus the Hamiltonian also equals the total energy. Also the Hamiltonian is conserved since
$\frac{dH}{dt} = 0$.


7.11 Hamiltonian for cyclic coordinates


It is interesting to discuss the properties of the Hamiltonian for cyclic coordinates $q_k$ for which $\partial L/\partial q_k = 0$.
Ignoring the external and Lagrange multiplier terms,

$$\dot{p}_k = \frac{\partial L}{\partial q_k} = -\frac{\partial H}{\partial q_k} = 0 \tag{7.49}$$

That is, a cyclic coordinate has a constant corresponding momentum $p_k$ for the Hamiltonian as well as
for the Lagrangian. Conversely, if a generalized coordinate does not occur in the Hamiltonian, then the
corresponding generalized momentum is conserved. Cyclic coordinates were discussed earlier when discussing
symmetries and conservation-law aspects of the Lagrangian. For example, if the Lagrangian, or Hamiltonian,
does not depend on a linear coordinate x, then $p_x$ is conserved. Similarly for θ and $p_\theta$. An extension of this
principle has been derived for the relationship between time independence and total energy of a system,
that is, the Hamiltonian equals the total energy if the transformation to generalized coordinates is time
independent and the potential is velocity independent.

A valuable feature of the Hamiltonian formulation is that it allows elimination of cyclic variables which
reduces the number of degrees of freedom to be handled. As a consequence, cyclic variables are called
ignorable variables in Hamiltonian mechanics. For example, consider that the Lagrangian has one cyclic
variable $q_n$. As a consequence, the Lagrangian does not depend on $q_n$, and thus it can be written as
$L = L(q_1, \ldots, q_{n-1}; \dot{q}_1, \ldots, \dot{q}_n; t)$. The Lagrangian still contains n generalized velocities, thus one still has to
treat n degrees of freedom even though one degree of freedom $q_n$ is cyclic. However, in the Hamiltonian
formulation, only n − 1 degrees of freedom are required since the momentum for the cyclic degree of freedom
is a constant $p_n = \alpha$. Thus the Hamiltonian can be written as $H = H(q_1, \ldots, q_{n-1}; p_1, \ldots, p_{n-1}; \alpha; t)$, that is,
the Hamiltonian includes only n − 1 degrees of freedom. Thus the dimension of the problem has been reduced
by one since the conjugate cyclic (ignorable) variables $(q_n, p_n)$ are eliminated. Hamiltonian mechanics can
significantly reduce the dimension of the problem when the system involves several cyclic variables. This is
in contrast to the situation for the Lagrangian approach as discussed in chapters 8 and 15.
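The reduction described above can be sketched for planar motion in a central potential, where φ is cyclic and $p_\phi = \alpha$ is constant. The example below is a hedged illustration rather than a general method: it assumes the potential U = −k/r, writes the reduced Hamiltonian in the single pair (r, $p_r$) with α as a fixed parameter, and integrates only the radial motion with a leapfrog step. All names and values are assumptions.

```python
m, k, alpha = 1.0, 1.0, 0.9   # alpha is the conserved momentum p_phi (assumed value)

def reduced_hamiltonian(r, p_r):
    """H(r, p_r; alpha) = p_r^2/2m + alpha^2/(2 m r^2) - k/r."""
    return p_r ** 2 / (2 * m) + alpha ** 2 / (2 * m * r ** 2) - k / r

def radial_force(r):
    """Minus the r-derivative of the effective potential alpha^2/(2 m r^2) - k/r."""
    return alpha ** 2 / (m * r ** 3) - k / r ** 2

r, p_r, dt = 1.0, 0.3, 1e-3
energies = []
for _ in range(20000):
    energies.append(reduced_hamiltonian(r, p_r))
    p_r += 0.5 * dt * radial_force(r)   # half kick
    r += dt * p_r / m                   # drift
    p_r += 0.5 * dt * radial_force(r)   # half kick

spread = max(energies) - min(energies)
print(f"H(0) = {energies[0]:.6f}, spread over the run = {spread:.2e}")
```

Only one coordinate and one momentum are advanced; the angle never needs to be computed unless it is wanted, in which case it follows from $\dot{\phi} = \alpha/(mr^2)$.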


7.12 Symmetries and invariance 


This chapter has shown that the symmetries of a system lead to invariance of physical quantities as was proposed
by Noether. The symmetry properties of the Lagrangian can lead to the conservation laws summarized
in table 7.2.


Table 7.2: Symmetries and conservation laws in classical mechanics

Symmetry            | Lagrange property         | Conserved quantity
Spatial invariance  | Translational invariance  | Linear momentum
Spatial isotropy    | Rotational invariance     | Angular momentum
Time invariance     | Time independence         | Total energy


The importance of the relations between invariance and symmetry cannot be overemphasized. It extends
beyond classical mechanics to quantum physics and field theory. For a three-dimensional closed system,
there are three possible constants for linear momentum, three for angular momentum, and one for energy. It
is especially interesting that these, and only these, seven integrals have the property that they are additive
for the particles comprising a system, and this occurs independent of whether there is an interaction among
the particles. That is, this behavior is obeyed by the whole assembly of particles for finite systems. Because
of their profound importance to physics, these relations between symmetry and invariance are used extensively.


7.13 Hamiltonian in classical mechanics 


The Hamiltonian was defined by equation 7.37 during the discussion of time invariance and energy conservation.
The Hamiltonian is of much more profound importance to physics than implied by the ad hoc definition
given by equation 7.37. This relates to the fact that the Hamiltonian is written in terms of the fundamental
coordinate $q_i$ and its generalized momentum $p_i$ defined by equation 7.3.

It is more convenient to write the n generalized coordinates $q_i$, plus their generalized momenta $p_i$, as
vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. The generalized momenta conjugate to the coordinate $q_i$,
defined by 7.3, then can be written in the form

$$p_i = \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_i} \tag{7.50}$$

Substituting this definition of the generalized momentum into the Hamiltonian defined in (7.37), and
expressing it in terms of the coordinate $\mathbf{q}$ and its conjugate generalized momenta $\mathbf{p}$, leads to

$$H(\mathbf{q},\mathbf{p},t) = \sum_i p_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.51}$$

$$= \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.52}$$

Note that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i\dot{q}_i$ equals 2T for systems that are scleronomic and when the
potential is velocity independent.

The crucial feature of the Hamiltonian is that it is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function
of the n generalized coordinates $\mathbf{q}$ and their conjugate momenta $\mathbf{p}$, which are taken to be independent, in
addition to the independent variable, t. This is in contrast to the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ which is a function
of the n generalized coordinates $q_j$, the corresponding velocities $\dot{q}_j$, and time t. The velocities $\dot{\mathbf{q}}$ are the
time derivatives of the coordinates $\mathbf{q}$ and thus these are related. In physics, the fundamental conjugate
coordinates are $(\mathbf{q},\mathbf{p})$, which are the coordinates underlying the Hamiltonian. This is in contrast to $(\mathbf{q},\dot{\mathbf{q}})$
which are the coordinates that underlie the Lagrangian. Thus the Hamiltonian is more fundamental than
the Lagrangian, which is one reason why Hamiltonian mechanics, rather than Lagrangian mechanics, was
used as the foundation for development of quantum and statistical mechanics.

Hamiltonian mechanics will be derived two other ways. Chapter 8 uses the Legendre transformation
between the conjugate variables $(\mathbf{q},\dot{\mathbf{q}},t)$ and $(\mathbf{q},\mathbf{p},t)$ where the generalized coordinate $\mathbf{q}$ and its conjugate
generalized momentum $\mathbf{p}$ are independent. This shows that Hamiltonian mechanics is based on the same
variational principles as those used to derive Lagrangian mechanics. Chapter 9 derives Hamiltonian mechanics
directly from Hamilton’s Principle of Least Action. Chapter 8 will introduce the algebraic Hamiltonian
mechanics, that is based on the Hamiltonian. The powerful capabilities provided by Hamiltonian mechanics
will be described in chapter 15.


7.14 Summary 


This chapter has explored the importance of symmetries and invariance in Lagrangian mechanics and has 
introduced the Hamiltonian. The following summarizes the important conclusions derived in this chapter. 

Noether’s theorem: 

Noether’s theorem explores the remarkable connection between symmetry, plus the invariance of a system
under transformation, and related conservation laws which imply the existence of important physical
principles, and constants of motion. Transformations where the equations of motion are invariant are called
invariant transformations. Variables that are invariant to a transformation are called cyclic variables. It
was shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement, $q_i$, then
the corresponding conjugate momentum, $p_i$, is conserved. This is Noether’s theorem which states “For each
symmetry of the Lagrangian, there is a conserved quantity”. In particular it was shown that translational
invariance in a given direction leads to the conservation of linear momentum in that direction, and rotational
invariance about an axis leads to conservation of angular momentum about that axis. These are the first-order
spatial and angular integrals of the equations of motion. Noether’s theorem also relates the properties
of the Hamiltonian to time invariance of the Lagrangian, namely:

(1) H is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit 
functions of time. 

(2) The Hamiltonian gives the total energy if the constraints and coordinate transformations are time 
independent and the potential energy is velocity independent. This is equivalent to stating that H = E if the 
constraints, or generalized coordinates, for the system are time independent. 

Noether’s theorem is of importance since it underlies the relation between symmetries and invariance in
all of physics; that is, its applicability extends beyond classical mechanics.

Generalized momentum:
The generalized momentum associated with the coordinate $q_j$ is defined to be

$$\frac{\partial L}{\partial\dot{q}_j} \equiv p_j \tag{7.3}$$

where $p_j$ is also called the conjugate momentum (or canonical momentum) to $q_j$ where $q_j$, $p_j$ are
conjugate, or canonical, variables. Remember that the linear momentum $p_j$ is the first-order time integral
given by equation 2.10. Note that if $q_j$ is not a spatial coordinate, then $p_j$ is not linear momentum, but is
the conjugate momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum.

Kinetic energy in generalized coordinates: 

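It was shown that the kinetic energy can be expressed in terms of generalized coordinates by

$$T(q,\dot{q},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha\sum_{i}\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$

$$= T_2(q,\dot{q},t) + T_1(q,\dot{q},t) + T_0(q,t) \tag{7.53}$$

For scleronomic systems with a potential that is velocity independent, then the kinetic energy can be
expressed as

$$T = T_2 = \frac{1}{2}\sum_l\dot{q}_l\,p_l = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$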


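Generalized energy
Jacobi’s Generalized Energy $h(q,\dot{q},t)$ was defined as

$$h(q,\dot{q},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L(q,\dot{q},t) \tag{7.36}$$

Hamiltonian function
The Hamiltonian $H(q,p,t)$ was defined in terms of the generalized energy $h(q,\dot{q},t)$ and by introducing
the generalized momentum. That is

$$H(q,p,t) \equiv h(q,\dot{q},t) = \sum_j p_j\dot{q}_j - L(q,\dot{q},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(q,\dot{q},t) \tag{7.37}$$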

Generalized energy theorem
The equations of motion lead to the generalized energy theorem which states that the time dependence
of the Hamiltonian is related to the time dependence of the Lagrangian.

$$\frac{dH(q,p,t)}{dt} = \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(q,t)\right] - \frac{\partial L(q,\dot{q},t)}{\partial t} \tag{7.38}$$

Note that if all the generalized non-potential forces are zero, then the bracket in equation 7.38 is zero, and
if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.

Generalized energy and total energy:

The generalized energy, and corresponding Hamiltonian, equal the total energy if:

1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities and the
transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.

2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial\dot{q}_j = 0$.

Chapter 8 will introduce Hamiltonian mechanics that is built on the Hamiltonian, and chapter 15 will
explore applications of Hamiltonian mechanics.


Workshop exercises
1. Consider a particle of mass m moving in a plane and subject to an inverse square attractive force. 


(a) Obtain the equations of motion. 
(b) Is the angular momentum about the origin conserved? 


(c) Obtain expressions for the generalized forces. 


2. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative
of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term
“generalized mechanics” is used.


(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations,
and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at
the end points, show that the corresponding Lagrange equation is

$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial\ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\right) + \frac{\partial L}{\partial q} = 0.$$

Such equations of motion have interesting applications in chaos theory.


(b) Apply this result to the Lagrangian

$$L = -\frac{m}{2}q\ddot{q} - \frac{k}{2}q^2.$$


Do you recognize the equations of motion? 


3. A uniform solid cylinder of radius R and mass M rests on a horizontal plane and an identical cylinder rests
on it touching along the top of the first cylinder with the axes of both cylinders parallel. The upper cylinder
is given an infinitesimal displacement so that both cylinders roll without slipping in the directions shown by
the arrows.


(a) Find the Lagrangian for this system.
(b) What are the constants of motion?


(c) Show that as long as the cylinders remain in contact then

$$\dot{\theta}^2 = \frac{12g(1 - \cos\theta)}{R(17 + 4\cos\theta - 4\cos^2\theta)}$$


4. Consider a diatomic molecule which has a symmetry axis along the line through the center of the two atoms 
comprising the molecule. Consider that this molecule is rotating about an axis perpendicular to the symmetry 
axis and that there are no external forces acting on the molecule. Use Noether’s Theorem to answer the 
following questions: 


a) Is the total angular momentum conserved? 
b) Is the projection of the total angular momentum along a space-fixed z axis conserved? 
c) Is the projection of the angular momentum along the symmetry axis of the rotating molecule conserved? 


d) Is the projection of the angular momentum perpendicular to the rotating symmetry axis conserved?

5. A bead of mass m slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the
vertical (x, z) plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on m?
(b) Set up Lagrange’s equation of motion for x with the constraint embedded.

(c) Set up Lagrange’s equations of motion for both x and z with the constraint adjoined and a Lagrange
multiplier $\lambda$ introduced.

(d) Show that the same equation of motion for x results from either of the methods used in part (b) or part
(c).
(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.

(f) What are the x and z components of the force of constraint in terms of $x$ and $\dot{x}$?


Problems 


1. Let the horizontal plane be the x-y plane. A bead of mass m is constrained to slide with speed v along a
curve described by the function y = f(x). What force does the curve apply to the bead? (Ignore gravity.)

2. Consider the Atwood’s machine shown. The masses are 4m, 5m, and 3m. Let x and y be the heights of the
right two masses relative to their initial positions.
a) Solve this problem using the Euler-Lagrange equations.

b) Use Noether’s theorem to find the conserved momentum.


3. A cube of side 2b and center of mass C is placed on a fixed horizontal cylinder of radius r and center O as
shown in the figure. Originally the cube is placed such that C is centered above O but it can roll from side to
side without slipping. (a) Assuming that b < r, use the Lagrangian approach to find the frequency for small
oscillations about the top of the cylinder. For simplicity make the small angle approximation for L before using
the Lagrange-Euler equations. (b) What will be the motion if b > r? Note that the moment of inertia of the
cube about the center of mass is $\frac{2}{3}mb^2$.

4. Two equal masses of mass m are glued to a massless hoop of radius R that is free to rotate about its center in a
vertical plane. The angle between the masses is $2\theta$, as shown. Find the frequency of oscillations.

5. Three massless sticks, each of length 2r and each carrying a mass m at its center, are
hinged at their ends as shown. The bottom end of the lower stick is hinged at the ground. They are held so
that the lower two sticks are vertical, and the upper one is tilted at a small angle $\epsilon$ with respect to the vertical.
They are then released. At the instant of release what are the three equations of motion derived from the
Lagrangian derived assuming that $\epsilon$ is small? Use these to determine the initial angular accelerations of the
three sticks.


Chapter 8 


Hamiltonian mechanics 


8.1 Introduction 


The three major formulations of classical mechanics are
1. Newtonian mechanics, which is the most intuitive vector formulation used in classical mechanics.

2. Lagrangian mechanics, which is a powerful algebraic formulation of classical mechanics derived using either
d’Alembert’s Principle or Hamilton’s Principle. The latter states “A dynamical system follows a path
that minimizes the time integral of the difference between the kinetic and potential energies”.

3. Hamiltonian mechanics, which has a beautiful superstructure that, like Lagrangian mechanics, is built
upon variational calculus, Hamilton’s principle, and Lagrangian mechanics.

Hamiltonian mechanics is introduced at this juncture since it is closely interwoven with Lagrangian mechanics.
Hamiltonian mechanics plays a fundamental role in modern physics, but the discussion of the important
role it plays in modern physics will be deferred until chapters 15 and 18 where applications to modern physics
are addressed.

The following important concepts were introduced in chapter 7: 

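The generalized momentum was defined to be given by

$$p_i \equiv \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_i} \tag{8.1}$$

Note that, as discussed in chapter 7.2, if the potential is velocity dependent, such as the Lorentz force, then
the generalized momentum includes terms in addition to the usual mechanical momentum.

Jacobi’s generalized energy function $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was introduced where

$$h(\mathbf{q}, \dot{\mathbf{q}}, t) \equiv \sum_j \left( \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} \right) - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.2}$$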


The Hamiltonian function was defined to be given by expressing the generalized energy function,
equation 8.2, in terms of the generalized momentum. That is, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed as

$$H(\mathbf{q}, \mathbf{p}, t) = \sum_i p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3}$$

The symbols $\mathbf{q}$, $\mathbf{p}$ designate vectors of $n$ generalized coordinates, $\mathbf{q} = (q_1, q_2, \ldots, q_n)$, $\mathbf{p} = (p_1, p_2, \ldots, p_n)$.
Equation 8.3 can be written compactly in a symmetric form using the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$.

$$H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.4}$$

A crucial feature of Hamiltonian mechanics is that the Hamiltonian is expressed as $H(\mathbf{q}, \mathbf{p}, t)$, that
is, it is a function of the $n$ generalized coordinates and their conjugate momenta, which are taken to be
independent, plus the independent variable, time. This contrasts with the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ which is a
function of the $n$ generalized coordinates $q_i$, and the corresponding velocities $\dot{q}_i$, that is the time derivatives
of the coordinates $q_i$, plus the independent variable, time.


8.2 Legendre Transformation between Lagrangian and Hamiltonian mechanics


Hamiltonian mechanics can be derived directly from Lagrangian mechanics by considering the Legendre
transformation between the conjugate variables $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $(\mathbf{q}, \mathbf{p}, t)$. Such a derivation is of considerable
importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those
used to derive Lagrangian mechanics; that is d’Alembert’s Principle and Hamilton’s Principle. The general
problem of converting Lagrange’s equations into the Hamiltonian form hinges on the inversion of equation
(8.1) that defines the generalized momentum $\mathbf{p}$. This inversion is simplified by the fact that (8.1) is the first
partial derivative of the Lagrangian scalar function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$.

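As described in appendix F.4, consider transformations between two functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$,
where $\mathbf{u}$ and $\mathbf{v}$ are the active variables related by the functional form

$$\mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w}) \tag{8.5}$$

and where $\mathbf{w}$ designates passive variables. The function $\nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w})$ is the first-order derivative (gradient)
of $F(\mathbf{u}, \mathbf{w})$ with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse
formula can always be written as a first-order derivative

$$\mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v}, \mathbf{w}) \tag{8.6}$$

The function $G(\mathbf{v}, \mathbf{w})$ is related to $F(\mathbf{u}, \mathbf{w})$ by the symmetric relation

$$G(\mathbf{v}, \mathbf{w}) + F(\mathbf{u}, \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} \tag{8.7}$$

where the scalar product $\mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{N} u_i v_i$.

Furthermore the first-order derivatives with respect to all the passive variables $w_i$ are related by

$$\nabla_{\mathbf{w}} F(\mathbf{u}, \mathbf{w}) = -\nabla_{\mathbf{w}} G(\mathbf{v}, \mathbf{w}) \tag{8.8}$$

The relationship between the functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ is symmetrical and each is said to be the
Legendre transform of the other.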

The general Legendre transform can be used to relate the Lagrangian and Hamiltonian by identifying the
active variables $\mathbf{v}$ with $\mathbf{p}$, and $\mathbf{u}$ with $\dot{\mathbf{q}}$, the passive variables $\mathbf{w}$ with $\mathbf{q}, t$, and the corresponding functions
$F(\mathbf{u}, \mathbf{w}) = L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $G(\mathbf{v}, \mathbf{w}) = H(\mathbf{q}, \mathbf{p}, t)$. Thus the generalized momentum (8.1) corresponds to

$$\mathbf{p} = \nabla_{\dot{\mathbf{q}}} L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.9}$$

where $(\mathbf{q}, t)$ are the passive variables. Then the Legendre transform states that the transformed variable $\dot{\mathbf{q}}$
is given by the relation

$$\dot{\mathbf{q}} = \nabla_{\mathbf{p}} H(\mathbf{q}, \mathbf{p}, t) \tag{8.10}$$

Since the functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $H(\mathbf{q}, \mathbf{p}, t)$ are the Legendre transforms of each other, they satisfy the
relation

$$H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.11}$$

The function $H(\mathbf{q}, \mathbf{p}, t)$, which is the Legendre transform of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, is called the
Hamiltonian function and equation (8.11) is identical to our original definition of the Hamiltonian given by
equation (8.3). The variables $\mathbf{q}$ and $t$ are passive variables thus equation (8.8) gives that

$$\nabla_{\mathbf{q}} L(\dot{\mathbf{q}}, \mathbf{q}, t) = -\nabla_{\mathbf{q}} H(\mathbf{p}, \mathbf{q}, t) \tag{8.12}$$

Written in component form equation 8.12 gives the partial derivative relations

$$\frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial q_i} = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_i} \tag{8.13}$$

$$\frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial t} = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial t} \tag{8.14}$$

Note that equations 8.13 and 8.14 are strictly a result of the Legendre transformation. To complete the
transformation from Lagrangian to Hamiltonian mechanics it is necessary to invoke the calculus of variations
via the Lagrange-Euler equations. The symmetry of the Legendre transform is illustrated by equation 8.11.

Equation 7.31 gives that the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = 2T_2$. For scleronomic systems, with velocity independent
potentials U, the standard Lagrangian L = T - U and H = 2T - T + U = T + U. Thus, for this simple
case, equation 8.11 reduces to an identity H + L = 2T.
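The Legendre-transform relations 8.9 to 8.14 can be spot-checked numerically. The following is a minimal sketch, not taken from the text: it assumes a driven one-dimensional oscillator with illustrative values for `M`, `K`, `F0` and `W`, chosen only so that the Lagrangian depends explicitly on time, and it uses a hypothetical finite-difference helper `partial`.

```python
import math

# Assumed illustrative parameters (not from the text): mass, spring constant,
# drive amplitude and drive angular frequency.
M, K, F0, W = 2.0, 3.0, 0.5, 1.5


def lagrangian(q, qdot, t):
    """L = T - U for a driven oscillator, chosen so that L depends on t."""
    return 0.5 * M * qdot**2 - 0.5 * K * q**2 + F0 * q * math.cos(W * t)


def hamiltonian(q, p, t):
    """H(q, p, t) = p*qdot - L, with qdot eliminated using p = M*qdot."""
    qdot = p / M
    return p * qdot - lagrangian(q, qdot, t)


def partial(f, args, index, h=1e-5):
    """Central finite-difference estimate of one partial derivative of f."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


q, qdot, t = 0.7, -1.3, 0.4
p = partial(lagrangian, (q, qdot, t), 1)          # equation 8.9
qdot_back = partial(hamiltonian, (q, p, t), 1)    # equation 8.10
gap_811 = hamiltonian(q, p, t) + lagrangian(q, qdot, t) - p * qdot
gap_813 = partial(lagrangian, (q, qdot, t), 0) + partial(hamiltonian, (q, p, t), 0)
gap_814 = partial(lagrangian, (q, qdot, t), 2) + partial(hamiltonian, (q, p, t), 2)
print(p, qdot_back, gap_811, gap_813, gap_814)
```

The three `gap_*` values should vanish to within the finite-difference error, which is how the sketch expresses equations 8.11, 8.13 and 8.14 at a single point.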


8.3 Hamilton’s equations of motion 
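The explicit form of the Legendre transform 8.10 gives that the time derivative of the generalized coordinate
$q_j$ is

$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.15}$$

The Euler-Lagrange equation 6.60 is

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.16}$$

This gives the corresponding Hamilton equation for the time derivative of $p_j$ to be

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \dot{p}_j = \frac{\partial L}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.17}$$

Substituting equation 8.13 into equation 8.17 leads to the second Hamilton equation of motion

$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.18}$$

One can explore further the implications of Hamiltonian mechanics by taking the time derivative of (8.3)
giving

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \frac{dp_j}{dt} + p_j \frac{d\dot{q}_j}{dt} - \frac{\partial L}{\partial q_j}\frac{dq_j}{dt} - \frac{\partial L}{\partial \dot{q}_j}\frac{d\dot{q}_j}{dt} \right) - \frac{\partial L}{\partial t} \tag{8.19}$$

Inserting the conjugate momenta $p_j = \frac{\partial L}{\partial \dot{q}_j}$ and equation 8.17 into equation 8.19 results in

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \dot{p}_j + p_j \ddot{q}_j - \left[ \dot{p}_j - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} - Q_j^{EXC} \right] \dot{q}_j - p_j \ddot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.20}$$

The second and fourth terms cancel as well as the $\dot{q}_j \dot{p}_j$ terms, leaving

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.21}$$

This is the generalized energy theorem given by equation 7.38.
The total time derivative of the Hamiltonian also can be written as

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \frac{\partial H}{\partial p_j} \dot{p}_j + \frac{\partial H}{\partial q_j} \dot{q}_j \right) + \frac{\partial H}{\partial t} \tag{8.22}$$

Using equations 8.15 and 8.18 to substitute for $\frac{\partial H}{\partial p_j}$ and $\frac{\partial H}{\partial q_j}$ in equation 8.22 gives

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) + \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial t} \tag{8.23}$$

Note that equation 8.23 must equal the generalized energy theorem, i.e. equation 8.21. Therefore,

$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$

In summary, Hamilton’s equations of motion are given by

$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.25}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{8.26}$$

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.27}$$

The symmetry of Hamilton’s equations of motion is illustrated when the Lagrange multiplier and generalized
forces are zero. Then

$$\dot{q}_j = \frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial p_j} \tag{8.28}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_j} \tag{8.29}$$

$$\frac{dH(\mathbf{p}, \mathbf{q}, t)}{dt} = \frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial t} = -\frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial t} \tag{8.30}$$

Many books present Hamilton’s equations only for this special simplified case, where the system is holonomic
and conservative, and generalized coordinates are used.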
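Equation 8.30 states that, without constraint or generalized forces, the Hamiltonian changes along a trajectory only through its explicit time dependence. The following sketch, which is illustrative rather than part of the original text, integrates equations 8.28 and 8.29 for an assumed driven oscillator $H = p^2/2M + Kq^2/2 - F_0 q \sin(Wt)$ with a standard fourth-order Runge-Kutta step, and accumulates the integral of $\partial H/\partial t$ in a third state variable named `work` (a hypothetical name). The parameter values are assumptions.

```python
import math

# Assumed illustrative parameters (not from the text).
M, K, F0, W = 1.0, 1.0, 0.5, 2.0


def hamiltonian(q, p, t):
    return p * p / (2.0 * M) + 0.5 * K * q * q - F0 * q * math.sin(W * t)


def derivs(state, t):
    """Right-hand sides: equations 8.28 and 8.29, plus the partial dH/dt."""
    q, p, _work = state
    dq = p / M                                   # dH/dp
    dp = -K * q + F0 * math.sin(W * t)           # -dH/dq
    dwork = -F0 * W * q * math.cos(W * t)        # explicit dH/dt (eq. 8.30)
    return (dq, dp, dwork)


def rk4_step(state, t, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))

    k1 = derivs(state, t)
    k2 = derivs(shifted(state, k1, dt / 2.0), t + dt / 2.0)
    k3 = derivs(shifted(state, k2, dt / 2.0), t + dt / 2.0)
    k4 = derivs(shifted(state, k3, dt), t + dt)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(q0, p0, t_end, dt=1e-3):
    state, t = (q0, p0, 0.0), 0.0
    for _ in range(round(t_end / dt)):
        state = rk4_step(state, t, dt)
        t += dt
    return state, t


(q1, p1, work), t1 = integrate(1.0, 0.0, 5.0)
h_change = hamiltonian(q1, p1, t1) - hamiltonian(1.0, 0.0, 0.0)
print(h_change, work)   # the two numbers should agree closely
```

Setting `F0 = 0` removes the explicit time dependence, in which case both numbers should be zero to within integration error and the Hamiltonian is a constant of motion.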


8.3.1 Canonical equations of motion 


Hamilton’s equations of motion, summarized in equations 8.25 to 8.27, use either a minimal set of generalized
coordinates, or the Lagrange multiplier terms, to account for holonomic constraints, or generalized forces
$Q_j^{EXC}$ to account for non-holonomic or other forces. Hamilton’s equations of motion usually are called the
canonical equations of motion. Note that the term “canonical” has nothing to do with religion or canon
law; the reason for this name has bewildered many generations of students of classical mechanics. The
term was introduced by Jacobi in 1837 to designate a simple and fundamental set of conjugate variables
and equations. Note the symmetry of Hamilton’s two canonical equations, plus the fact that the canonical
variables $p_k, q_k$ are treated as independent canonical variables. The Lagrange mechanics coordinates $(\mathbf{q}, \dot{\mathbf{q}}, t)$
are replaced by the Hamiltonian mechanics coordinates $(\mathbf{q}, \mathbf{p}, t)$, where the conjugate momenta $\mathbf{p}$ are taken
to be independent of the coordinate $\mathbf{q}$.

Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of
equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational
principle, chapter 9.2, and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations
give 2s first-order differential equations for $p_k, q_k$ for each of the $s = n - m$ degrees of freedom. Lagrange’s
equations give s second-order differential equations for the s independent generalized coordinates $q_k, \dot{q}_k$.

It has been shown that $H(\mathbf{p}, \mathbf{q}, t)$ and $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ are the Legendre transforms of each other. Although
the Lagrangian formulation is ideal for solving numerical problems in classical mechanics, the Hamiltonian
formulation provides a better framework for conceptual extensions to other fields of physics since it is written
in terms of the fundamental conjugate coordinates, $\mathbf{q}, \mathbf{p}$. The Hamiltonian is used extensively in modern
physics, including quantum physics, as discussed in chapters 15 and 18. For example, in quantum mechanics
there is a straightforward relation between the classical and quantal representations of momenta; this does
not exist for the velocities.

The concept of state space, introduced in chapter 3.3.2, applies naturally to Lagrangian mechanics since
$(\mathbf{q}, \dot{\mathbf{q}})$ are the generalized coordinates used in Lagrangian mechanics. The concept of phase space, introduced
in chapter 3.3.3, applies naturally to Hamiltonian mechanics since $(\mathbf{p}, \mathbf{q})$ are the generalized coordinates
used in Hamiltonian mechanics.




8.4 Hamiltonian in different coordinate systems 


Prior to solving problems using Hamiltonian mechanics, it is useful to express the Hamiltonian in cylindrical 
and spherical coordinates for the special case of conservative forces since these are encountered frequently 
in physics. 
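8.4.1 Cylindrical coordinates ρ, z, φ

Consider cylindrical coordinates $\rho, z, \phi$. Expressed in cartesian coordinates

$$x = \rho\cos\phi \qquad y = \rho\sin\phi \qquad z = z \tag{8.31}$$

Using appendix table C.3, the Lagrangian can be written in cylindrical coordinates as

$$L = T - U = \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) - U(\rho, z, \phi) \tag{8.32}$$

The conjugate momenta are

$$p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m\dot{\rho} \tag{8.33}$$

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m\rho^2\dot{\phi} \tag{8.34}$$

$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z} \tag{8.35}$$

Assume a conservative force, then H is conserved. Since the transformation from cartesian to non-rotating
generalized cylindrical coordinates is time independent, then H = E. Then using (8.32 to 8.35) gives
the Hamiltonian in cylindrical coordinates to be

$$H(\mathbf{q}, \mathbf{p}, t) = \sum_i p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.36}$$

$$= \left(p_\rho\dot{\rho} + p_\phi\dot{\phi} + p_z\dot{z}\right) - \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) + U(\rho, z, \phi)$$

$$= \frac{1}{2m}\left(p_\rho^2 + \frac{p_\phi^2}{\rho^2} + p_z^2\right) + U(\rho, z, \phi) \tag{8.37}$$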
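As a quick numerical cross-check of equations 8.33 to 8.37, the sketch below (an illustration, not part of the original text) converts an arbitrary cartesian state into cylindrical conjugate momenta and confirms that the kinetic part of equation 8.37 reproduces $\frac{1}{2}mv^2$. The function names and the sample state are assumptions.

```python
import math


def cylindrical_state(x, y, z, vx, vy, vz, m):
    """Return (rho, p_rho, p_phi, p_z) for a cartesian position and velocity."""
    rho = math.hypot(x, y)
    rho_dot = (x * vx + y * vy) / rho
    phi_dot = (x * vy - y * vx) / rho**2
    p_rho = m * rho_dot               # equation 8.33
    p_phi = m * rho**2 * phi_dot      # equation 8.34
    p_z = m * vz                      # equation 8.35
    return rho, p_rho, p_phi, p_z


def kinetic_cylindrical(rho, p_rho, p_phi, p_z, m):
    """Kinetic part of the Hamiltonian in equation 8.37."""
    return (p_rho**2 + p_phi**2 / rho**2 + p_z**2) / (2.0 * m)


# Assumed sample state (arbitrary units).
m = 1.5
x, y, z, vx, vy, vz = 0.3, -0.4, 1.0, 2.0, 1.0, -0.5
rho, p_rho, p_phi, p_z = cylindrical_state(x, y, z, vx, vy, vz, m)
t_cyl = kinetic_cylindrical(rho, p_rho, p_phi, p_z, m)
t_cart = 0.5 * m * (vx**2 + vy**2 + vz**2)
print(t_cyl, t_cart, p_phi)
```

Note that `p_phi` equals $m(x\dot{y} - y\dot{x})$, the z component of the angular momentum, which is why it is the quantity conserved when $\phi$ is cyclic.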


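The canonical equations of motion in cylindrical coordinates can be written as

$$\dot{p}_\rho = -\frac{\partial H}{\partial \rho} = \frac{p_\phi^2}{m\rho^3} - \frac{\partial U}{\partial \rho} \tag{8.38}$$

$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.39}$$

$$\dot{p}_z = -\frac{\partial H}{\partial z} = -\frac{\partial U}{\partial z} \tag{8.40}$$

$$\dot{\rho} = \frac{\partial H}{\partial p_\rho} = \frac{p_\rho}{m} \tag{8.41}$$

$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m\rho^2} \tag{8.42}$$

$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{8.43}$$

Note that if $\phi$ is cyclic, that is $\frac{\partial U}{\partial \phi} = 0$, then the angular momentum about the z axis, $p_\phi$, is a constant
of motion. Similarly, if z is cyclic, then $p_z$ is a constant of motion.


8.4.2 Spherical coordinates r, θ, φ

Appendix table C.4 shows that the spherical coordinates are related to the cartesian coordinates by

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta \tag{8.44}$$

The Lagrangian is

$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) - U(r, \theta, \phi) \tag{8.45}$$

The conjugate momenta are

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} \tag{8.46}$$

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} \tag{8.47}$$

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = mr^2\sin^2\theta\,\dot{\phi} \tag{8.48}$$

Assuming a conservative force then H is conserved. Since the transformation from cartesian to generalized
spherical coordinates is time independent, then H = E. Thus using (8.46 to 8.48) the Hamiltonian is given
in spherical coordinates by

$$H(\mathbf{q}, \mathbf{p}, t) = \sum_i p_i \dot{q}_i - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.49}$$

$$= \left(p_r\dot{r} + p_\theta\dot{\theta} + p_\phi\dot{\phi}\right) - \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) + U(r, \theta, \phi) \tag{8.50}$$

$$= \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta}\right) + U(r, \theta, \phi) \tag{8.51}$$

Then the canonical equations of motion in spherical coordinates are

$$\dot{p}_r = -\frac{\partial H}{\partial r} = \frac{1}{mr^3}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) - \frac{\partial U}{\partial r} \tag{8.52}$$

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{1}{mr^2}\left(\frac{p_\phi^2\cos\theta}{\sin^3\theta}\right) - \frac{\partial U}{\partial \theta} \tag{8.53}$$

$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.54}$$

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} \tag{8.55}$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} \tag{8.56}$$

$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{mr^2\sin^2\theta} \tag{8.57}$$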
Note that if the coordinate $\phi$ is cyclic, that is $\frac{\partial U}{\partial \phi} = 0$, then the angular momentum $p_\phi$ is conserved. Also
if the $\theta$ coordinate is cyclic, and $p_\phi = 0$, that is, there is no change in the angular momentum perpendicular
to the z axis, then $p_\theta$ is conserved.

An especially important spherically-symmetric Hamiltonian is that for a central field. Central fields, such
as the gravitational or Coulomb fields of uniform spherical mass or charge distributions, are spherically
symmetric, so the potential U depends on neither $\theta$ nor $\phi$. Thus $\phi$ is cyclic and the projection of the angular
momentum $p_\phi$ about the z axis is conserved for these spherically symmetric potentials. In addition, since the
orientation of the z axis is arbitrary for a spherically symmetric potential, the total angular momentum also
must be conserved, as is predicted by Noether’s theorem.
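As a hedged cross-check of the central-field statement, the short derivation below (not part of the original text) uses equations 8.53, 8.54 and 8.56 with $\frac{\partial U}{\partial \theta} = \frac{\partial U}{\partial \phi} = 0$. The symbol $\ell^2$ is introduced here only for illustration; it denotes the standard spherical-coordinate expression for the squared total angular momentum, $\ell^2 = p_\theta^2 + p_\phi^2/\sin^2\theta$.

```latex
\begin{align*}
\frac{d\ell^2}{dt}
 &= 2 p_\theta \dot{p}_\theta
  + \frac{2 p_\phi \dot{p}_\phi}{\sin^2\theta}
  - \frac{2 p_\phi^2 \cos\theta}{\sin^3\theta}\,\dot{\theta} \\
 &= 2 p_\theta \left(\frac{p_\phi^2 \cos\theta}{m r^2 \sin^3\theta}\right)
  + 0
  - \frac{2 p_\phi^2 \cos\theta}{\sin^3\theta}\left(\frac{p_\theta}{m r^2}\right)
  = 0
\end{align*}
```

Thus $\ell^2$ is a constant of motion for a central field even though $p_\theta$ alone is conserved only in the special case $p_\phi = 0$.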
8.5 Applications of Hamiltonian Dynamics


The equations of motion of a system can be derived using the Hamiltonian coupled with Hamilton’s equations
of motion, that is, equations 8.25 to 8.27.

Formally the Hamiltonian is constructed from the Lagrangian. That is

1) Select a set of independent generalized coordinates $q_i$.
2) Partition the active forces.
3) Construct the Lagrangian $L(q_i, \dot{q}_i, t)$.
4) Derive the conjugate generalized momenta via $p_i = \frac{\partial L}{\partial \dot{q}_i}$.
5) Knowing $L, \dot{q}_i, p_i$ derive $H = \sum_i p_i \dot{q}_i - L$.
6) Derive $\dot{q}_j = \frac{\partial H}{\partial p_j}$ and $\dot{p}_j = -\frac{\partial H}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC}$.


This procedure appears to be unnecessarily complicated compared to just using the Lagrangian plus
Lagrangian mechanics to derive the equations of motion. Fortunately the above lengthy procedure often can
be bypassed for conservative systems. That is, if the following conditions are satisfied:

i) $L = T(\dot{q}) - U(q)$, that is, $U(q)$ is independent of the velocity $\dot{q}$.

ii) the generalized coordinates are time independent.

then it is possible to use the fact that H = T + U = E.
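Steps 4 to 6 of the formal construction above can be sketched numerically for a single unconstrained degree of freedom. This is an illustrative sketch, not the author's procedure: the helper names are hypothetical, derivatives are taken by central finite differences, and the inversion of $p = \partial L/\partial \dot{q}$ assumes that the Lagrangian is at most quadratic in the velocity, so that $p$ is linear in $\dot{q}$. The plane pendulum with assumed values of m, l and g is used as the test case.

```python
import math


def d_dx(f, x, h=1e-5):
    """Central finite-difference derivative of a one-argument function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def conjugate_momentum(lagrangian, q, qdot):
    """Step 4: p = dL/d(qdot)."""
    return d_dx(lambda v: lagrangian(q, v), qdot)


def hamiltonian(lagrangian, q, p):
    """Step 5: H = p*qdot - L with qdot expressed through p.

    Assumes L is at most quadratic in qdot, so p = a(q)*qdot + b(q) and the
    inversion needs only two evaluations of the conjugate momentum.
    """
    b = conjugate_momentum(lagrangian, q, 0.0)
    a = conjugate_momentum(lagrangian, q, 1.0) - b
    qdot = (p - b) / a
    return p * qdot - lagrangian(q, qdot)


def hamilton_equations(lagrangian, q, p):
    """Step 6 without constraint or generalized forces: (qdot, pdot)."""
    qdot = d_dx(lambda pp: hamiltonian(lagrangian, q, pp), p)
    pdot = -d_dx(lambda qq: hamiltonian(lagrangian, qq, p), q)
    return qdot, pdot


# Hypothetical test case: plane pendulum with assumed m, l, g.
m, l, g = 1.0, 2.0, 9.81


def pendulum_lagrangian(theta, theta_dot):
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * math.cos(theta)


theta_dot, p_dot = hamilton_equations(pendulum_lagrangian, 0.5, 1.2)
print(theta_dot, p_dot)
```

For the pendulum the two returned values should match $p_\theta/ml^2$ and $-mgl\sin\theta$. In practice, when conditions i) and ii) hold it is simpler to write H = T + U directly, as the examples that follow do.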

The following five examples illustrate the use of Hamiltonian mechanics to derive the equations of motion. 


8.1 Example: Motion in a uniform gravitational field 


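Consider a mass m in a uniform gravitational field acting in the $-z$ direction. The Lagrangian for this
simple case is

$$L = \frac{1}{2} m \left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - mgz$$

Therefore the generalized momenta are $p_x = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$, $p_y = \frac{\partial L}{\partial \dot{y}} = m\dot{y}$, $p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z}$. The corresponding
Hamiltonian H is

$$H = \sum_i p_i \dot{q}_i - L = p_x\dot{x} + p_y\dot{y} + p_z\dot{z} - L$$

$$= \frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m} - \frac{1}{2}\left(\frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m}\right) + mgz = \frac{1}{2}\left(\frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m}\right) + mgz$$

Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian is a constant of motion.
Hamilton’s equations give that

$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m} \qquad \dot{p}_x = -\frac{\partial H}{\partial x} = 0$$

$$\dot{y} = \frac{\partial H}{\partial p_y} = \frac{p_y}{m} \qquad \dot{p}_y = -\frac{\partial H}{\partial y} = 0$$

$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \qquad \dot{p}_z = -\frac{\partial H}{\partial z} = -mg$$

Combining these gives that $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = -g$. Note that the linear momenta $p_x$ and $p_y$ are constants
of motion whereas the rate of change of $p_z$ is given by the gravitational force $-mg$. Note also that H = T + U
for this conservative system.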


8.2 Example: One-dimensional harmonic oscillator 


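Consider a mass m subject to a linear restoring force with spring constant k. The Lagrangian L = T - U
equals

$$L = \frac{1}{2} m \dot{x}^2 - \frac{1}{2} k x^2$$

Therefore the generalized momentum is

$$p_x = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$$

The Hamiltonian H is

$$H = \sum_i p_i \dot{q}_i - L = p_x \dot{x} - L$$

$$= p_x \frac{p_x}{m} - \frac{1}{2}\frac{p_x^2}{m} + \frac{1}{2} k x^2 = \frac{1}{2}\frac{p_x^2}{m} + \frac{1}{2} k x^2$$

Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian will be a constant of motion.
Hamilton’s equations give that

$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}$$

or

$$p_x = m\dot{x}$$

In addition

$$\dot{p}_x = -\frac{\partial H}{\partial x} = -kx$$

Combining these gives that

$$\ddot{x} + \frac{k}{m} x = 0$$

which is the equation of motion for the harmonic oscillator.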
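The two first-order Hamilton equations of this example can be integrated directly. The sketch below is illustrative only: it uses a leapfrog (kick-drift-kick) scheme, which is a common choice for separable Hamiltonians but is not discussed in the text, together with assumed values for m and k. It checks that the trajectory returns to its starting point after one period $2\pi\sqrt{m/k}$ and that the Hamiltonian is conserved.

```python
import math


def leapfrog(x, p, m, k, dt, steps):
    """Integrate xdot = p/m, pdot = -k*x with a kick-drift-kick scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * k * x     # half kick:  pdot = -dH/dx = -k x
        x += dt * p / m           # drift:      xdot =  dH/dp = p/m
        p -= 0.5 * dt * k * x     # half kick
    return x, p


def energy(x, p, m, k):
    """H = p^2/(2m) + k x^2/2."""
    return p * p / (2.0 * m) + 0.5 * k * x * x


# Assumed illustrative values (arbitrary units).
m, k = 0.5, 8.0
omega = math.sqrt(k / m)
period = 2.0 * math.pi / omega
steps = 2000
x1, p1 = leapfrog(1.0, 0.0, m, k, period / steps, steps)
print(x1, p1, energy(x1, p1, m, k) - energy(1.0, 0.0, m, k))
```

A smaller time step reduces the residual phase error; the energy error of this scheme stays bounded rather than drifting, which is one reason it is often preferred for Hamiltonian systems.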


8.3 Example: Plane pendulum 


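The plane pendulum, in a uniform gravitational field g, is an interesting system to consider. There is
only one generalized coordinate, $\theta$, and the Lagrangian for this system is

$$L = \frac{1}{2} m l^2 \dot{\theta}^2 + mgl\cos\theta$$

The momentum conjugate to $\theta$ is

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m l^2 \dot{\theta}$$

which is the angular momentum about the pivot point.
The Hamiltonian is

$$H = \sum_i p_i \dot{q}_i - L = p_\theta \dot{\theta} - L = \frac{1}{2} m l^2 \dot{\theta}^2 - mgl\cos\theta = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta$$

Hamilton’s equations of motion give

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{ml^2}$$

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = -mgl\sin\theta$$

Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore the Hamiltonian is conserved.
Also the potential is velocity independent and the transformation to the generalized coordinate $\theta$ is time
independent, thus the Hamiltonian equals the total energy, that is

$$H = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E$$

where E is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since $\dot{p}_\theta$
explicitly depends on $\theta$.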




The solutions for the plane pendulum on a $(\theta, p_\theta)$ phase diagram, shown in the adjacent figure, illustrate
the motion. The upper phase-space plot shows the range $(\theta = \pm\pi, p_\theta)$. Note that $\theta = +\pi$ and $-\pi$
correspond to the same physical point, that is the phase diagram should be rolled into a cylinder connected along
the dashed lines. The lower phase space plot shows two cycles for $\theta$ to better illustrate the cyclic nature of
the phase diagram. The corresponding state-space diagram is shown in figure 3.4. The trajectories are ellipses
for low energy $-mgl < E < mgl$ corresponding to oscillations of the pendulum about $\theta = 0$. The center
of the ellipse (0,0) is a stable equilibrium point for the oscillation. However, there is a phase change to
rotational motion about the horizontal axis when $|E| > mgl$, that is, the pendulum swings around a circle
continuously, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at
$E = mgl$ and is designated by the separatrix trajectory.

[Figure: Phase-space diagrams for the plane pendulum. The separatrix (bold line) separates the oscillatory
solutions from the rolling solutions. The upper (a) shows one complete cycle while the lower (b) shows two
complete cycles. Labels: elliptic point, hyperbolic point; oscillation ($-mgl < E < mgl$), separatrix ($E = mgl$),
rotation ($E > mgl$).]

The plot of $p_\theta$ versus $\theta$ for the plane pendulum is better presented on a cylindrical phase space representation
since $\theta$ is a cyclic variable that cycles around the cylinder, whereas $p_\theta$ oscillates equally about zero
having both positive and negative values. When wrapped around a cylinder then the unstable and stable
equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $p_\theta = 0$. For
small oscillations about equilibrium, also called librations, the correlation between $p_\theta$ and $\theta$ is given by the
clockwise closed ellipses wrapped on the cylindrical surface, whereas for energies $|E| > mgl$ the positive
$p_\theta$ corresponds to counterclockwise rotations while the negative $p_\theta$ corresponds to clockwise rotations.
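The three regimes on the pendulum phase diagram can be identified from the conserved energy alone. The sketch below is an illustration under assumed values of m, l and g; the function names are hypothetical. It evaluates $E = p_\theta^2/2ml^2 - mgl\cos\theta$ for a phase-space point, classifies the trajectory through that point, and for oscillations returns the turning angle where $p_\theta = 0$.

```python
import math


def pendulum_energy(theta, p_theta, m, l, g=9.81):
    """Conserved Hamiltonian of the plane pendulum."""
    return p_theta**2 / (2.0 * m * l**2) - m * g * l * math.cos(theta)


def classify(theta, p_theta, m, l, g=9.81, rel_tol=1e-12):
    """Return 'oscillation', 'separatrix' or 'rotation' for the trajectory."""
    e = pendulum_energy(theta, p_theta, m, l, g)
    e_sep = m * g * l                      # energy of the separatrix
    if abs(e - e_sep) <= rel_tol * e_sep:
        return "separatrix"
    return "oscillation" if e < e_sep else "rotation"


def turning_angle(theta, p_theta, m, l, g=9.81):
    """Maximum angle of an oscillation, where p_theta = 0 (E < mgl only)."""
    e = pendulum_energy(theta, p_theta, m, l, g)
    return math.acos(-e / (m * g * l))


# Assumed illustrative pendulum: m = 1 kg, l = 1 m.
print(classify(0.1, 0.0, 1.0, 1.0))       # small swing released from rest
print(classify(0.0, 7.0, 1.0, 1.0))       # large momentum at the bottom
print(classify(math.pi, 0.0, 1.0, 1.0))   # at rest at the inverted position
```

The inverted position at rest is the hyperbolic point of the diagram, which lies on the separatrix, while points with $E < mgl$ circulate around the elliptic point at (0,0).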


8.4 Example: Hooke’s law force constrained to the surface of a cylinder 


Consider the case where a mass m is attracted by a force directed toward the origin and proportional to the
distance from the origin. Determine the Hamiltonian if the mass is constrained to move on the surface of a
cylinder defined by

$$x^2 + y^2 = R^2$$

[Figure: Mass attracted to origin by a force proportional to distance from origin with the motion constrained
to the surface of a cylinder.]

It is natural to transform this problem to cylindrical coordinates $\rho, z, \theta$. Since the force is just Hooke’s law

$$\mathbf{F} = -k\mathbf{r}$$

the potential is the same as for the harmonic oscillator, that is

$$U = \frac{1}{2} k r^2 = \frac{1}{2} k \left(\rho^2 + z^2\right)$$

This is independent of $\theta$, and thus $\theta$ is cyclic.
In cylindrical coordinates the velocity is

$$v^2 = \dot{\rho}^2 + \rho^2\dot{\theta}^2 + \dot{z}^2$$

Confined to the surface of the cylinder means that

$$\rho = R \qquad \dot{\rho} = 0$$

Then the Lagrangian simplifies to

$$L = T - U = \frac{1}{2} m \left(R^2\dot{\theta}^2 + \dot{z}^2\right) - \frac{1}{2} k \left(R^2 + z^2\right)$$

The generalized coordinates are $\theta, z$ and the corresponding generalized momenta are

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mR^2\dot{\theta} \tag{a}$$

$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z} \tag{b}$$

The system is conservative, and the transformation from rectangular to cylindrical coordinates does not
depend explicitly on time. Therefore the Hamiltonian is conserved and equals the total energy. That is

$$H = \sum_i p_i \dot{q}_i - L = \frac{p_\theta^2}{2mR^2} + \frac{p_z^2}{2m} + \frac{1}{2} k \left(R^2 + z^2\right) = E$$

The equations of motion then are given by the canonical equations

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = 0 \qquad \dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mR^2} \tag{c}$$

$$\dot{p}_z = -\frac{\partial H}{\partial z} = -kz \qquad \dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{d}$$

Equations (a) and (c) imply that

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mR^2\dot{\theta} = \text{constant}$$

Thus the angular momentum about the axis of the cylinder is conserved, that is, $\theta$ is a cyclic variable.
Combining equations (b) and (d) implies that

$$\ddot{z} + \frac{k}{m} z = 0$$

This is the equation for simple harmonic motion with angular frequency $\omega = \sqrt{k/m}$. The symmetries imply
that this problem has the same solutions for the z coordinate as the harmonic oscillator, while the $\theta$ coordinate
moves with constant angular velocity.


8.5 Example: Electron motion in a cylindrical magnetron 


A magnetron comprises a hot cylindrical wire cathode that emits electrons and is at a high negative 
voltage. It is surrounded by a larger diameter concentric cylindrical anode at ground potential. A uniform 
magnetic field runs parallel to the cylindrical axis of the magnetron. The electron beam excites a multiple set 
of microwave cavities located around the circumference of the cylindrical wall of the anode. The magnetron 
was invented in England during World War 2 to generate microwaves required for the development of radar. 

Consider a non-relativistic electron of mass m and charge $-e$ in a cylindrical magnetron moving between
the central cathode wire, of radius a at a negative electric potential $-\phi_0$, and a concentric cylindrical anode
conductor of radius R which has zero electric potential. There is a uniform constant magnetic field B parallel
to the cylindrical axis of the magnetron.

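Using SI units and cylindrical coordinates $(r, \theta, z)$ aligned with the axis of the magnetron, the
electromagnetic force Lagrangian, given in chapter 6.10, equals

$$L = \frac{1}{2} m \dot{\mathbf{r}}^2 + e\left(\phi - \dot{\mathbf{r}} \cdot \mathbf{A}\right)$$

The electric and vector potentials for the magnetron geometry are

$$\phi = -\phi_0 \frac{\ln(r/R)}{\ln(a/R)}$$

$$\mathbf{A} = \frac{1}{2} B r \,\hat{\boldsymbol{\theta}}$$

Thus expressed in cylindrical coordinates the Lagrangian equals

$$L = \frac{1}{2} m \left(\dot{r}^2 + r^2\dot{\theta}^2 + \dot{z}^2\right) + e\phi - \frac{1}{2} e B r^2 \dot{\theta}$$

The generalized momenta are

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r}$$

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} - \frac{1}{2} e B r^2$$

$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z}$$

Note that the vector potential $\mathbf{A}$ contributes an additional term to the angular momentum $p_\theta$.
Using the above generalized momenta leads to the Hamiltonian

$$H = p_r\dot{r} + p_\theta\dot{\theta} + p_z\dot{z} - L$$

$$= \frac{1}{2} m \left(\dot{r}^2 + r^2\dot{\theta}^2 + \dot{z}^2\right) - e\phi$$

$$= \frac{p_r^2}{2m} + \frac{1}{2mr^2}\left(p_\theta + \frac{1}{2} e B r^2\right)^2 + \frac{p_z^2}{2m} - e\phi$$

$$= \frac{1}{2m}\left[p_r^2 + \left(\frac{p_\theta}{r} + \frac{1}{2} e B r\right)^2 + p_z^2\right] - e\phi$$

Note that the Hamiltonian is not an explicit function of time, therefore it is a constant of motion which
equals the total energy.

$$H = \frac{1}{2m}\left[p_r^2 + \left(\frac{p_\theta}{r} + \frac{1}{2} e B r\right)^2 + p_z^2\right] - e\phi = E$$

Since $\dot{p}_i = -\frac{\partial H}{\partial q_i}$, if H is not an explicit function of $q_i$, then $\dot{p}_i = 0$, that is, $p_i$ is a constant of motion.
Thus $p_\theta$ and $p_z$ are constants of motion.
Consider the initial conditions $r = a$, $\dot{r} = \dot{\theta} = \dot{z} = 0$. Then

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} - \frac{1}{2} e B r^2 = -\frac{1}{2} e B a^2$$

$$p_z = 0$$

$$H = \frac{1}{2m}\left[p_r^2 + \left(-\frac{e B a^2}{2r} + \frac{1}{2} e B r\right)^2\right] + e\phi_0 \frac{\ln(r/R)}{\ln(a/R)} = e\phi_0$$

Note that at $r = R$, then $p_r$ is given by the last equation since the Hamiltonian equals a constant $e\phi_0$. That
is, assuming that $a \ll R$ then

$$\left(p_r^2\right)_{r=R} = 2me\phi_0 - \left(\frac{1}{2} e B R\right)^2$$

Define a critical magnetic field by

$$B_c \equiv \frac{2}{R}\sqrt{\frac{2m\phi_0}{e}}$$

then

$$\left(p_r^2\right)_{r=R} = \left(B_c^2 - B^2\right)\left(\frac{1}{2} e R\right)^2$$

Note that if $B < B_c$ then $p_r$ is real at $r = R$. However, if $B > B_c$ then $p_r$ is imaginary at $r = R$
implying that there must be a maximum orbit radius $r_0$ for the electron where $r_0 < R$. That is, the electron
trajectories are confined spatially to coaxial cylindrical orbits concentric with the magnetron electromagnetic
fields. These closed electron trajectories excite the microwave cavities located in the nearby outer cylindrical
wall of the anode.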
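To give a sense of scale for the critical field, the sketch below evaluates $B_c = \frac{2}{R}\sqrt{2m\phi_0/e}$ and the sign of $p_r^2$ at the anode for an assumed geometry. The anode radius (1 cm) and cathode potential (1 kV) are illustrative assumptions, not values from the text, and the function names are hypothetical. The formulas assume $a \ll R$ as in the example.

```python
import math

ELECTRON_MASS = 9.1093837015e-31      # kg (CODATA 2018)
ELEMENTARY_CHARGE = 1.602176634e-19   # C (exact SI value)


def critical_field(anode_radius, phi0):
    """B_c = (2/R) * sqrt(2 m phi0 / e), valid for a << R."""
    return (2.0 / anode_radius) * math.sqrt(
        2.0 * ELECTRON_MASS * phi0 / ELEMENTARY_CHARGE)


def radial_momentum_squared_at_anode(b_field, anode_radius, phi0):
    """p_r^2 at r = R for a << R: 2 m e phi0 - (e B R / 2)^2."""
    m, e = ELECTRON_MASS, ELEMENTARY_CHARGE
    return 2.0 * m * e * phi0 - (0.5 * e * b_field * anode_radius) ** 2


def reaches_anode(b_field, anode_radius, phi0):
    """True when p_r is real at the anode, i.e. the electron can reach it."""
    return radial_momentum_squared_at_anode(b_field, anode_radius, phi0) >= 0.0


# Assumed illustrative geometry: R = 1 cm, phi0 = 1 kV.
R, PHI0 = 0.01, 1000.0
b_c = critical_field(R, PHI0)
print(b_c)                                  # critical field in tesla
print(reaches_anode(0.5 * b_c, R, PHI0))    # weaker field: reaches the anode
print(reaches_anode(2.0 * b_c, R, PHI0))    # stronger field: confined orbit
```

For these assumed values the critical field is about 0.021 T, so fields of a few hundredths of a tesla are enough to confine the electrons inside the anode.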
8.6 Routhian reduction


Noether’s theorem states that if the coordinate $q_j$ is cyclic, and if the Lagrange multiplier plus generalized
force contributions for the $j^{th}$ coordinate are zero, then the canonical momentum of the cyclic variable, $p_j$, is
a constant of motion as is discussed in chapter 7.3. Therefore, for cyclic variables the coordinate $q_j$ does not
appear in the Hamiltonian and its conjugate momentum $p_j$ is a constant of motion, so these $(q_j, p_j)$
coordinates can be factored out of the Hamiltonian $H(\mathbf{p}, \mathbf{q}, t)$. This
reduces the number of degrees of freedom included in the Hamiltonian. For this reason, cyclic variables are
called ignorable variables in Hamiltonian mechanics. This advantage does not apply to the $(q_j, \dot{q}_j)$ variables
used in Lagrangian mechanics since $\dot{q}_j$ is not a constant of motion for a cyclic coordinate. The ability
to eliminate the cyclic variables as unknowns in the Hamiltonian is a valuable advantage of Hamiltonian
mechanics that is exploited extensively for solving problems, as is described in chapter 15.


It is advantageous to have the ability to exploit both the Lagrangian and Hamiltonian formulations
simultaneously when handling systems that involve a mixture of cyclic and non-cyclic coordinates. The equations
of motion for each independent generalized coordinate can be derived independently of the remaining
generalized coordinates. Thus it is possible to select either the Hamiltonian or the Lagrangian formulations for each
generalized coordinate, independent of what is used for the other generalized coordinates. Routh[Rou1860]
devised an elegant, and useful, hybrid technique that separates the cyclic and non-cyclic generalized
coordinates in order to simultaneously exploit the differing advantages of both the Hamiltonian and Lagrangian
formulations of classical mechanics. The Routhian reduction approach partitions the $\sum_{i=1}^{n} p_i \dot{q}_i$ kinetic energy
term in the Hamiltonian into a cyclic group, plus a non-cyclic group, i.e.

$$H(q_1, \ldots, q_n; p_1, \ldots, p_n; t) = \sum_{i=1}^{n} p_i \dot{q}_i - L = \sum_{cyclic} p_i \dot{q}_i + \sum_{noncyclic} p_i \dot{q}_i - L \tag{8.58}$$

Routh’s clever idea was to define a new function, called the Routhian, that includes only one of the two
partitions of the kinetic energy terms. This makes the Routhian a Hamiltonian for the coordinates for which
the kinetic energy terms are included, while the Routhian acts like a negative Lagrangian for the coordinates
where the kinetic energy term is omitted. This book defines two Routhians.

$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) \equiv \sum_{cyclic}^{m} p_i \dot{q}_i - L \tag{8.59}$$

$$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) \equiv \sum_{noncyclic}^{s} p_i \dot{q}_i - L \tag{8.60}$$

The first Routhian, called $R_{cyclic}$, includes the kinetic energy terms only for the cyclic variables, and behaves
like a Hamiltonian for the cyclic variables, and behaves like a negative Lagrangian for the non-cyclic variables. The
second Routhian, called $R_{noncyclic}$, includes the kinetic energy terms for only the non-cyclic variables, and
behaves like a Hamiltonian for the non-cyclic variables, and behaves like a negative Lagrangian for the cyclic
variables. These two Routhians complement each other in that they make the Routhian either a Hamiltonian
for the cyclic variables, or the converse where the Routhian is a Hamiltonian for the non-cyclic variables.
The Routhians use $(q_i, \dot{q}_i)$ to denote those coordinates for which the Routhian behaves like a Lagrangian, and
$(q_i, p_i)$ for those coordinates where the Routhian behaves like a Hamiltonian. For uniformity, it is assumed
that the degrees of freedom between $1 \le i \le s$ are non-cyclic, while those between $s+1 \le i \le n$ are ignorable
cyclic coordinates.

The Routhian is a hybrid of Lagrangian and Hamiltonian mechanics. Some textbooks minimize discussion 
of the Routhian on the grounds that this hybrid approach is not fundamental. However, the Routhian is 
used extensively in engineering in order to derive the equations of motion for rotating systems. In addition 
it is used when dealing with rotating nuclei in nuclear physics, rotating molecules in molecular physics, and 
rotating galaxies in astrophysics. The Routhian reduction technique provides a powerful way to calculate 
the intrinsic properties for a rotating system in the rotating frame of reference. The Routhian approach is 
included in this textbook because it plays an important role in practical applications of rotating systems, plus 
it nicely illustrates the relative advantages of the Lagrangian and Hamiltonian formulations in mechanics.


8.6.1 $R_{cyclic}$ - Routhian is a Hamiltonian for the cyclic variables


The cyclic Routhian $R_{cyclic}$ is defined assuming that the variables with $1\le i\le s$ are non-cyclic, where
$s=n-m$, while the $m$ variables with $s+1\le i\le n$ are ignorable cyclic coordinates. The cyclic Routhian
$R_{cyclic}$ expresses the cyclic coordinates in terms of $(q,p)$, which are required for use by Hamilton's equations,
while the non-cyclic variables are expressed in terms of $(q,\dot{q})$ for use by the Lagrange equations. That is,
the cyclic Routhian $R_{cyclic}$ is defined to be

$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)=\sum_{cyclic}^{m}p_i\dot{q}_i-L \tag{8.61}$$

where the summation $\sum_{cyclic}p_i\dot{q}_i$ is over only the $m$ cyclic variables $s+1\le i\le n$. Note that the Lagrangian
can be split into the cyclic and the non-cyclic parts

$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)=\sum_{cyclic}^{m}p_i\dot{q}_i-L_{cyclic}-L_{noncyclic} \tag{8.62}$$

The first two terms on the right can be combined to give the Hamiltonian $H_{cyclic}$ for only the $m$ cyclic
variables, $i=s+1,s+2,\dots,n$, that is

$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)=H_{cyclic}-L_{noncyclic} \tag{8.63}$$

The Routhian $R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)$ also can be written in an alternate form

$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)=\sum_{cyclic}^{m}p_i\dot{q}_i-L=\sum_{i=1}^{n}p_i\dot{q}_i-L-\sum_{noncyclic}^{s}p_i\dot{q}_i \tag{8.64}$$

$$=H-\sum_{noncyclic}^{s}p_i\dot{q}_i \tag{8.65}$$

which is expressed as the complete Hamiltonian minus the kinetic energy term for the noncyclic coordinates.
The Routhian $R_{cyclic}$ behaves like a Hamiltonian for the $m$ cyclic coordinates and behaves like a negative
Lagrangian $L_{noncyclic}$ for all the $s=n-m$ noncyclic coordinates $i=1,2,\dots,s$. Thus the equations of motion
for the $s$ non-cyclic variables are given using Lagrange's equations of motion, while the Routhian behaves
like a Hamiltonian $H_{cyclic}$ for the $m$ ignorable cyclic variables $i=s+1,\dots,n$.
Ignoring both the Lagrange multiplier and generalized forces, then the partitioned equations of motion
for the non-cyclic and cyclic generalized coordinates are given in Table 8.1.


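Table 8.1: Equations of motion for the Routhian $R_{cyclic}$

                    | Lagrange equations | Hamilton equations
Coordinates         | Noncyclic: $1\le i\le s$ | Cyclic: $s+1\le i\le n$
Equations of motion | $\frac{\partial R_{cyclic}}{\partial q_i}=-\frac{\partial L_{noncyclic}}{\partial q_i}$ | $\frac{\partial R_{cyclic}}{\partial q_i}=-\dot{p}_i$
                    | $\frac{\partial R_{cyclic}}{\partial \dot{q}_i}=-\frac{\partial L_{noncyclic}}{\partial \dot{q}_i}$ | $\frac{\partial R_{cyclic}}{\partial p_i}=\dot{q}_i$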


Thus there are $m$ cyclic (ignorable) coordinates $(q,p)_{s+1},\dots,(q,p)_n$ which obey Hamilton's equations of
motion, while the first $s=n-m$ non-cyclic (non-ignorable) coordinates $(q,\dot{q})_1,\dots,(q,\dot{q})_s$, for $i=1,2,\dots,s$,
obey Lagrange equations. The solution for the cyclic variables is trivial since their conjugate momenta are constants
of motion, and thus the Routhian $R_{cyclic}$ has reduced the number of equations of motion that must be solved from $n$ to
the $s=n-m$ non-cyclic variables. This Routhian provides an especially useful way to reduce the number
of equations of motion for rotating systems.
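
The partition into Lagrange and Hamilton equations can be checked from the total differential of $R_{cyclic}$. The following is only a brief sketch of that step, assuming no Lagrange multipliers or generalized forces, and using $p_i=\partial L/\partial\dot{q}_i$ together with Lagrange's equation $\dot{p}_i=\partial L/\partial q_i$ for the cyclic coordinates:

```latex
\begin{aligned}
dR_{cyclic} &= \sum_{i=s+1}^{n}\left(\dot{q}_i\,dp_i + p_i\,d\dot{q}_i\right)
  - \sum_{i=1}^{n}\left(\frac{\partial L}{\partial q_i}\,dq_i
  + \frac{\partial L}{\partial \dot{q}_i}\,d\dot{q}_i\right)
  - \frac{\partial L}{\partial t}\,dt \\
&= \sum_{i=s+1}^{n}\left(\dot{q}_i\,dp_i - \dot{p}_i\,dq_i\right)
  - \sum_{i=1}^{s}\left(\frac{\partial L}{\partial q_i}\,dq_i
  + \frac{\partial L}{\partial \dot{q}_i}\,d\dot{q}_i\right)
  - \frac{\partial L}{\partial t}\,dt
\end{aligned}
```

Reading off the coefficients gives $\partial R_{cyclic}/\partial p_i=\dot{q}_i$ and $\partial R_{cyclic}/\partial q_i=-\dot{p}_i$ for the cyclic coordinates, while for the non-cyclic coordinates the derivatives of $R_{cyclic}$ are the negatives of the corresponding derivatives of the Lagrangian, so that $\frac{d}{dt}\frac{\partial R_{cyclic}}{\partial\dot{q}_i}-\frac{\partial R_{cyclic}}{\partial q_i}=0$.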

Note that there are several definitions used to define the Routhian, for example some books define this
Routhian as being the negative of the definition used here so that it corresponds to a positive Lagrangian.
However, this sign usually cancels when deriving the equations of motion, thus the sign convention is
unimportant if a consistent sign convention is used.


8.6.2 $R_{noncyclic}$ - Routhian is a Hamiltonian for the non-cyclic variables


The non-cyclic Routhian $R_{noncyclic}$ complements $R_{cyclic}$. Again the generalized coordinates with $1\le i\le s$
are assumed to be non-cyclic, while those with $s+1\le i\le n$ are ignorable cyclic coordinates. However,
the expressions in terms of $(q,p)$ and $(q,\dot{q})$ are interchanged, that is, the cyclic variables are expressed in
terms of $(q,\dot{q})$ and the non-cyclic variables are expressed in terms of $(q,p)$, which is opposite of what was
used for $R_{cyclic}$.

$$R_{noncyclic}(q_1,\dots,q_n;p_1,\dots,p_s;\dot{q}_{s+1},\dots,\dot{q}_n;t)=\sum_{noncyclic}^{s}p_i\dot{q}_i-L_{noncyclic}-L_{cyclic} \tag{8.66}$$

$$=H_{noncyclic}-L_{cyclic} \tag{8.67}$$

It can be written in a frequently used form

$$R_{noncyclic}(q_1,\dots,q_n;p_1,\dots,p_s;\dot{q}_{s+1},\dots,\dot{q}_n;t)=\sum_{noncyclic}^{s}p_i\dot{q}_i-L=\sum_{i=1}^{n}p_i\dot{q}_i-L-\sum_{cyclic}^{m}p_i\dot{q}_i=H-\sum_{cyclic}^{m}p_i\dot{q}_i \tag{8.68}$$

This Routhian behaves like a Hamiltonian for the $s$ non-cyclic variables, which are expressed in terms of $q$
and $p$ appropriate for a Hamiltonian. This Routhian writes the $m$ cyclic coordinates in terms of $q$ and $\dot{q}$,
appropriate for a Lagrangian, which are treated assuming the Routhian $R_{noncyclic}$ is a negative Lagrangian for
these cyclic variables, as summarized in Table 8.2.


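Table 8.2: Equations of motion for the Routhian $R_{noncyclic}$

                    | Hamilton equations | Lagrange equations
Coordinates         | Noncyclic: $1\le i\le s$ | Cyclic: $s+1\le i\le n$
Equations of motion | $\frac{\partial R_{noncyclic}}{\partial q_i}=-\dot{p}_i$ | $\frac{\partial R_{noncyclic}}{\partial q_i}=-\frac{\partial L_{cyclic}}{\partial q_i}$
                    | $\frac{\partial R_{noncyclic}}{\partial p_i}=\dot{q}_i$ | $\frac{\partial R_{noncyclic}}{\partial \dot{q}_i}=-\frac{\partial L_{cyclic}}{\partial \dot{q}_i}$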


This non-cyclic Routhian $R_{noncyclic}$ is especially useful since it equals the Hamiltonian for the non-cyclic
variables, that is, the kinetic energy for motion of the cyclic variables has been removed. Note that since the
cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion if $H$ is a constant of motion.
However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent,
that is, $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion. For example, when used
to describe rotational motion, $R_{noncyclic}$ corresponds to the energy in the non-inertial rotating body-fixed
frame of reference. This is especially useful in treating rotating systems such as rotating galaxies, rotating
machinery, molecules, or rotating strongly-deformed nuclei as discussed in chapter 12.9.

The Lagrangian and Hamiltonian are the fundamental algebraic approaches to classical mechanics. The
Routhian reduction method is a valuable hybrid technique that exploits a trick to reduce the number of
variables that have to be solved for complicated problems encountered in science and engineering. The
Routhian $R_{noncyclic}$ provides the most useful approach for solving the equations of motion for rotating
molecules, deformed nuclei, or astrophysical objects in that it gives the Hamiltonian in the non-inertial
body-fixed rotating frame of reference ignoring the rotational energy of the frame. By contrast, the cyclic
Routhian $R_{cyclic}$ is especially useful to exploit Lagrangian mechanics for solving problems in rigid-body
rotation such as the Tippe Top described in example 13.13.

Note that the Lagrangian, Hamiltonian, plus both the $R_{cyclic}$ and $R_{noncyclic}$ Routhians, all are scalars
under rotation, that is, they are rotationally invariant. However, they may be expressed in terms of the
coordinates in either the stationary or a rotating frame. The major difference is that the Routhian includes
only subsets of the kinetic energy term $\sum_i p_i\dot{q}_i$. The relative merits of using Lagrangian, Hamiltonian, and
both the $R_{cyclic}$ and $R_{noncyclic}$ Routhian reduction methods, are illustrated by the following examples.


8.6 Example: Spherical pendulum using Hamiltonian mechanics


The spherical pendulum provides a simple test case for comparison of the use of Lagrangian mechanics,
Hamiltonian mechanics, and both approaches to Routhian reduction. The Lagrangian mechanics solution of the
spherical pendulum is described in example 6.10. The solution using Hamiltonian mechanics is given in this
example followed by solutions using both of the Routhian reduction approaches.

[Figure: Spherical pendulum]

Consider the equations of motion of a spherical pendulum of mass $m$ and length $b$. The generalized
coordinates are $\theta,\phi$ since the length is fixed at $r=b$. The kinetic energy is

$$T=\frac{1}{2}mb^2\dot{\theta}^2+\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2$$

The potential energy is $U=-mgb\cos\theta$, giving that

$$L(r,\theta,\phi,\dot{r},\dot{\theta},\dot{\phi})=\frac{1}{2}mb^2\dot{\theta}^2+\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2+mgb\cos\theta$$

The generalized momenta are

$$p_\theta=\frac{\partial L}{\partial\dot{\theta}}=mb^2\dot{\theta}\qquad p_\phi=\frac{\partial L}{\partial\dot{\phi}}=mb^2\sin^2\theta\,\dot{\phi}$$

Since the system is conservative, and the transformation from rectangular to spherical coordinates does not
depend explicitly on time, then the Hamiltonian is conserved and equals the total energy. The generalized
momenta allow the Hamiltonian to be written as

$$H(r,\theta,\phi,p_r,p_\theta,p_\phi)=\frac{p_\theta^2}{2mb^2}+\frac{p_\phi^2}{2mb^2\sin^2\theta}-mgb\cos\theta$$

The equations of motion are

$$\dot{p}_\theta=-\frac{\partial H}{\partial\theta}=\frac{p_\phi^2\cos\theta}{mb^2\sin^3\theta}-mgb\sin\theta \tag{a}$$

$$\dot{p}_\phi=-\frac{\partial H}{\partial\phi}=0 \tag{b}$$

$$\dot{\theta}=\frac{\partial H}{\partial p_\theta}=\frac{p_\theta}{mb^2} \tag{c}$$

$$\dot{\phi}=\frac{\partial H}{\partial p_\phi}=\frac{p_\phi}{mb^2\sin^2\theta} \tag{d}$$

Taking the time derivative of equation (c) and using (a) to substitute for $\dot{p}_\theta$ gives that

$$\ddot{\theta}-\frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta}+\frac{g}{b}\sin\theta=0 \tag{e}$$

Note that equation (b) shows that $\phi$ is a cyclic coordinate. Thus

$$p_\phi=mb^2\sin^2\theta\,\dot{\phi}=\text{constant}$$


that is, the angular momentum about the vertical axis is conserved. Note that although $p_\phi$ is a constant of
motion, $\dot{\phi}=\frac{p_\phi}{mb^2\sin^2\theta}$ is a function of $\theta$, and thus in general it is not conserved. There are various solutions
depending on the initial conditions. If $p_\phi=0$ then the pendulum is just the simple pendulum discussed
previously that can oscillate, or rotate, in the $\theta$ direction. The opposite extreme is where $p_\theta=0$ where the
pendulum rotates in the $\phi$ direction with constant $\theta$. In general the motion is a complicated coupling of the
$\theta$ and $\phi$ motions.


8.7 Example: Spherical pendulum using $R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)$


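The Lagrangian for the spherical pendulum is

$$L(r,\theta,\phi,\dot{r},\dot{\theta},\dot{\phi})=\frac{1}{2}mb^2\dot{\theta}^2+\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2+mgb\cos\theta$$

Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\dot{p}_\phi=\frac{\partial L}{\partial\phi}=-\frac{\partial H}{\partial\phi}=0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi=\frac{\partial L}{\partial\dot{\phi}}=mb^2\sin^2\theta\,\dot{\phi}$$

The Routhian $R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)$ equals

$$R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)=p_\phi\dot{\phi}-L$$

$$=-\left[\frac{1}{2}mb^2\dot{\theta}^2+\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2+mgb\cos\theta-mb^2\sin^2\theta\,\dot{\phi}^2\right]$$

$$=-\frac{1}{2}mb^2\dot{\theta}^2+\frac{p_\phi^2}{2mb^2\sin^2\theta}-mgb\cos\theta$$

The Routhian $R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)$ behaves like a Hamiltonian for $\phi$, and like a Lagrangian $L'=-R_{cyclic}$
for $\theta$. Use of Hamilton's canonical equations for $\phi$ gives

$$\dot{\phi}=\frac{\partial R_{cyclic}}{\partial p_\phi}=\frac{p_\phi}{mb^2\sin^2\theta}$$

$$-\dot{p}_\phi=\frac{\partial R_{cyclic}}{\partial\phi}=0$$

These two equations show that $p_\phi$ is a constant of motion given by

$$mb^2\sin^2\theta\,\dot{\phi}=p_\phi=\text{constant} \tag{$\alpha$}$$

Note that the Hamiltonian part of this Routhian includes only the kinetic energy for the $\phi$ motion, whose conjugate
momentum $p_\phi$ is a constant of motion, but this energy does not equal the total energy. This solution is what is
predicted by Noether's theorem due to the symmetry of the Lagrangian about the vertical $\phi$ axis.

Since $R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)$ behaves like a Lagrangian for $\theta$, then the Lagrange equation for $\theta$ is

$$\Lambda_\theta L=\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial\dot{\theta}}\right)-\frac{\partial R_{cyclic}}{\partial\theta}=0$$

where the negative sign of the Lagrangian in $R_{cyclic}(r,\theta,\phi,\dot{r},\dot{\theta},p_\phi)$ cancels. This leads to

$$mb^2\ddot{\theta}=\frac{p_\phi^2\cos\theta}{mb^2\sin^3\theta}-mgb\sin\theta$$

that is

$$\ddot{\theta}-\frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta}+\frac{g}{b}\sin\theta=0 \tag{$\beta$}$$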
This result is identical to the one obtained using Lagrangian mechanics in example 6.10 and Hamiltonian
mechanics given in example 8.6. The Routhian $R_{cyclic}$ simplified the problem to one degree of freedom $\theta$ by
absorbing into the Hamiltonian the ignorable cyclic $\phi$ coordinate and its conserved conjugate momentum $p_\phi$.
Note that the central term in equation $\beta$ is the centrifugal term which is due to rotation about the vertical
axis. This term is zero for plane pendulum motion when $p_\phi=0$.


8.8 Example: Spherical pendulum using $R_{noncyclic}(r,\theta,\phi,p_r,p_\theta,\dot{\phi})$


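For a rotational system the Routhian $R_{noncyclic}(r,\theta,\phi,p_r,p_\theta,\dot{\phi})$ also can be used to project out the
Hamiltonian for the active variables in the rotating body-fixed frame of reference. Consider the spherical pendulum
where the rotating frame is rotating with angular velocity $\dot{\phi}$. The Lagrangian for the spherical pendulum is

$$L(r,\theta,\phi,\dot{r},\dot{\theta},\dot{\phi})=\frac{1}{2}mb^2\dot{\theta}^2+\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2+mgb\cos\theta$$

Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\dot{p}_\phi=\frac{\partial L}{\partial\phi}=-\frac{\partial H}{\partial\phi}=0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi=\frac{\partial L}{\partial\dot{\phi}}=mb^2\sin^2\theta\,\dot{\phi}$$

The total Hamiltonian is given by

$$H(r,\theta,\phi,p_r,p_\theta,p_\phi)=\sum_i p_i\dot{q}_i-L=\frac{p_\theta^2}{2mb^2}+\frac{p_\phi^2}{2mb^2\sin^2\theta}-mgb\cos\theta$$

The Routhian for the rotating frame of reference, $H_{rot}$, is given by equation 8.68, that is

$$R_{noncyclic}(r,\theta,\phi,p_r,p_\theta,\dot{\phi})=\sum_{i=1}^{n}p_i\dot{q}_i-p_\phi\dot{\phi}-L=H-p_\phi\dot{\phi}$$

$$=\frac{p_\theta^2}{2mb^2}+\frac{p_\phi^2}{2mb^2\sin^2\theta}-mgb\cos\theta-p_\phi\dot{\phi}$$

$$=\frac{p_\theta^2}{2mb^2}-\frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2-mgb\cos\theta \tag{$\gamma$}$$

This behaves like a negative Lagrangian for $\phi$ and a Hamiltonian for $\theta$. The conjugate momenta are

$$p_\phi=\frac{\partial L}{\partial\dot{\phi}}=-\frac{\partial R_{noncyclic}}{\partial\dot{\phi}}=mb^2\sin^2\theta\,\dot{\phi}$$

$$\dot{p}_\phi=\frac{\partial L}{\partial\phi}=-\frac{\partial R_{noncyclic}}{\partial\phi}=0$$

that is, $p_\phi$ is a constant of motion.
Hamilton's equations of motion give

$$\dot{\theta}=\frac{\partial R_{noncyclic}}{\partial p_\theta}=\frac{p_\theta}{mb^2} \tag{$\delta$}$$

$$-\dot{p}_\theta=\frac{\partial R_{noncyclic}}{\partial\theta}=-\frac{p_\phi^2\cos\theta}{mb^2\sin^3\theta}+mgb\sin\theta \tag{$\epsilon$}$$

Equation $\delta$ gives that

$$\frac{d\dot{\theta}}{dt}=\frac{\dot{p}_\theta}{mb^2}$$

Inserting this into equation $\epsilon$ gives

$$\ddot{\theta}-\frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta}+\frac{g}{b}\sin\theta=0$$

which is identical to the equation of motion $\beta$ derived using $R_{cyclic}$. The Hamiltonian in the rotating frame
is a constant of motion given by $\gamma$, but it does not include the total energy.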
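
The reduced one-dimensional equation of motion for $\theta$ lends itself to a quick numerical check. The following is a minimal sketch, not part of the original treatment, that assumes the reduced equation $\ddot{\theta}=\frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta}-\frac{g}{b}\sin\theta$ and the total Hamiltonian $H=\frac{p_\theta^2}{2mb^2}+\frac{p_\phi^2}{2mb^2\sin^2\theta}-mgb\cos\theta$. The function names, the default parameter values, and the choice of a fourth-order Runge-Kutta integrator are illustrative assumptions.

```python
import math

def theta_ddot(theta, p_phi, m=1.0, b=1.0, g=9.81):
    """Angular acceleration from the reduced equation of motion for theta."""
    centrifugal = p_phi**2 * math.cos(theta) / (m**2 * b**4 * math.sin(theta)**3)
    return centrifugal - (g / b) * math.sin(theta)

def total_energy(theta, theta_dot, p_phi, m=1.0, b=1.0, g=9.81):
    """Total Hamiltonian H, using p_theta = m b^2 theta_dot."""
    p_theta = m * b**2 * theta_dot
    return (p_theta**2 / (2 * m * b**2)
            + p_phi**2 / (2 * m * b**2 * math.sin(theta)**2)
            - m * g * b * math.cos(theta))

def rk4_step(theta, theta_dot, p_phi, dt):
    """One Runge-Kutta step for (theta, theta_dot) with constant p_phi."""
    def f(th, thd):
        return thd, theta_ddot(th, p_phi)
    k1 = f(theta, theta_dot)
    k2 = f(theta + 0.5 * dt * k1[0], theta_dot + 0.5 * dt * k1[1])
    k3 = f(theta + 0.5 * dt * k2[0], theta_dot + 0.5 * dt * k2[1])
    k4 = f(theta + dt * k3[0], theta_dot + dt * k3[1])
    theta += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    theta_dot += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta, theta_dot

def integrate(theta0, theta_dot0, p_phi, dt=1e-3, steps=5000):
    """Integrate the reduced theta motion and return the final state."""
    theta, theta_dot = theta0, theta_dot0
    for _ in range(steps):
        theta, theta_dot = rk4_step(theta, theta_dot, p_phi, dt)
    return theta, theta_dot

def conical_p_phi(theta, m=1.0, b=1.0, g=9.81):
    """Value of p_phi for which theta stays constant (0 < theta < pi/2)."""
    return m * math.sqrt(g * b**3 * math.sin(theta)**4 / math.cos(theta))
```

For example, `integrate(1.0, 0.0, 2.0)` follows the $\theta$ motion for fixed $p_\phi$, and `total_energy` can be evaluated before and after to monitor conservation of $H$; `conical_p_phi` gives the angular momentum for which the centrifugal and gravitational terms balance so that $\theta$ remains constant.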

Note that these examples show that both forms of the Routhian, as well as the complete Lagrangian
formalism, shown in example 6.10, and complete Hamiltonian formalism, shown in example 8.6, all give the
same equations of motion. This illustrates that the Lagrangian, Hamiltonian, and Routhian mechanics all
give the same equations of motion and this applies both in the static inertial frame as well as a rotating frame
since the Lagrangian, Hamiltonian and Routhian all are scalars under rotation, that is, they are rotationally
invariant.


8.9 Example: Single particle moving in a vertical plane under the influence of an inverse-square central force


The Lagrangian for a single particle of mass $m$, moving in a vertical plane and subject to an
inverse-square central force, is specified by two generalized coordinates, $r$ and $\theta$.

$$L=\frac{m}{2}\left(\dot{r}^2+r^2\dot{\theta}^2\right)+\frac{k}{r}$$

The ignorable coordinate is $\theta$, since it is cyclic. Let the constant conjugate momentum be denoted by
$p_\theta=\frac{\partial L}{\partial\dot{\theta}}=mr^2\dot{\theta}$. Then the corresponding cyclic Routhian is

$$R_{cyclic}(r,\theta,\dot{r},p_\theta)=p_\theta\dot{\theta}-L=\frac{p_\theta^2}{2mr^2}-\frac{1}{2}m\dot{r}^2-\frac{k}{r}$$

This Routhian is the equivalent one-dimensional potential $U(r)$ minus the kinetic energy of radial motion.
Applying Hamilton's equations to the cyclic coordinate $\theta$ gives

$$\dot{p}_\theta=0\qquad\frac{p_\theta}{mr^2}=\dot{\theta}$$

implying a solution

$$p_\theta=mr^2\dot{\theta}=l$$

where the angular momentum $l$ is a constant.
The Lagrange-Euler equation can be applied to the non-cyclic coordinate $r$

$$\Lambda_rL=\frac{d}{dt}\frac{\partial R_{cyclic}}{\partial\dot{r}}-\frac{\partial R_{cyclic}}{\partial r}=0$$

where the negative sign of $R_{cyclic}$ cancels. This leads to the radial solution

$$m\ddot{r}-\frac{p_\theta^2}{mr^3}+\frac{k}{r^2}=0$$

where $p_\theta=l$, which is a constant of motion in the centrifugal term. Thus the problem has been reduced to a
one-dimensional problem in radius $r$ that is in a rotating frame of reference.

8.7 Variable-mass systems 


Lagrangian and Hamiltonian mechanics assume that the total mass and energy of the system are conserved. 
Variable-mass systems involve transferring mass and energy between donor and receptor bodies. However, 
such systems still can be conservative if the Lagrangian or Hamiltonian include all the active degrees of 
freedom for the combined donor-receptor system. The following examples of variable-mass systems illustrate
subtle complications that occur when handling such problems using algebraic mechanics.


8.7.1 Rocket propulsion: 


Newtonian mechanics was used to solve the rocket problem in chapter 2.12.6. The equation of motion
(2.113) relating the rocket thrust $F_{ex}$ to the rate of change of the momentum separated into two terms,

$$F_{ex}=\dot{p}_y=m\ddot{y}+\dot{m}\dot{y} \tag{8.69}$$

The first term is the usual mass times acceleration, while the second term arises from the rate of change of
mass times the velocity. The equation of motion for rocket motion is easily derived using either Lagrangian
or Hamiltonian mechanics by relating the rocket thrust to the generalized force $Q_j^{EXC}$.




8.7.2 Moving chains: 


The motion of a flexible, frictionless, heavy chain that is falling in a gravitational field, often can be split into 
two coupled variable-mass partitions that have different chain-link velocities. These partitions are coupled 
at the moving intersection between the chain partitions. That is, these partitions share time-dependent 
fractions of the total chain mass. Moving chains were discussed first by Cayley in 1857[Cay1857] and since
then the moving chain problem has had a controversial history due to the frequent erroneous assumption
that, in the gravitational field, the chain partitions fall with acceleration $g$ rather than applying the correct
energy conservation assumption for this conservative system. The following two examples of conservative
falling-chain systems illustrate solutions obtained using variational principles applied to a single chain that
is partitioned into two variable length sections.¹

Consider the following two possible scenarios for motion of a flexible, heavy, frictionless chain located in
a uniform gravitational field $g$. The first scenario is the “folded chain” system which assumes that one end
of the chain is held fixed, while the adjacent free end is released at the same altitude as the top of the fixed
arm, and this free end is allowed to fall in the constant gravitational field $g$. The second, “falling chain”,
scenario assumes that one end of the chain is hanging down through a hole in a frictionless, smooth, rigid,
horizontal table, with the stationary partition of the chain sitting on the table surrounding the hole. The
falling section of this chain is being pulled out of the stationary pile by the hanging partition. Both of these
systems are conservative since it is assumed that the total mass of the chain is fixed, and no dissipative forces
are acting. The chains are assumed to be inextensible, flexible, and frictionless, and subject to a uniform
gravitational field $g$ in the vertical $y$ direction. In both examples, the chain, with mass $M$ and length $L$, is
partitioned into a stationary segment, plus a moving segment, where the mass per unit length of the chain
is $\mu=\frac{M}{L}$. These partitions are strongly coupled at their intersection which propagates downward with time
for the “folded chain” and propagates upward, relative to the lower end of the falling chain, for the “falling
chain”. For the “folded chain”, the chain links are transferred from the moving segment to the stationary
segment as the moving section falls. By contrast, for the “falling chain”, the chain links are transferred
from the stationary upper section to the moving lower segment of the chain.


8.10 Example: Folded chain 


The folded chain of length $L$ and mass-per-unit-length $\mu=\frac{M}{L}$ hangs
vertically downwards in a gravitational field $g$ with both ends held initially
at the same height. The fixed end is attached to a fixed support while the
free end of the chain is dropped at time $t=0$ with the free end at the same
height and adjacent to the fixed end. Let $y$ be the distance the falling free
end is below the fixed end. Using an idealized one-dimensional assumption,
the Lagrangian $L$ is given by

$$L(y,\dot{y})=\frac{\mu}{4}(L-y)\dot{y}^2+Mg\left[\frac{L^2+2Ly-y^2}{4L}\right] \tag{8.70}$$

where the bracket in the second term is the height of the center of mass of
the folded chain with respect to the fixed upper end of the chain.

The Hamiltonian is given by

$$H(y,p_R)=p_R\dot{y}-L(y,\dot{y})=\frac{p_R^2}{\mu(L-y)}-Mg\left[\frac{L^2+2Ly-y^2}{4L}\right] \tag{8.71}$$

where $p_R$ is the linear momentum of the right-hand arm of the folded chain.
As shown in the discussion of the Generalized Energy Theorem (chapters 7.8 and 7.9), when all the
active forces are included in the Lagrangian and the Hamiltonian, then the total mechanical energy $E$ is
given by $E=H$. Moreover, both the Lagrangian and the Hamiltonian are time independent, since

$$\frac{dE}{dt}=\frac{dH}{dt}=-\frac{\partial L}{\partial t}=0 \tag{8.72}$$

Therefore the “folded chain” Hamiltonian equals the total energy, which is a constant of motion. Energy
conservation for this system can be used to give

$$\frac{\mu}{4}(L-y)\dot{y}^2-\frac{1}{4}\mu g\left(L^2+2Ly-y^2\right)=-\frac{1}{4}\mu gL^2 \tag{8.73}$$

¹Discussions with Professor Frank Wolfs stimulated inclusion of these two examples of moving chains.


Solving for $\dot{y}^2$ gives

$$\dot{y}^2=g\frac{(2Ly-y^2)}{(L-y)} \tag{8.74}$$

The acceleration of the falling arm, $\ddot{y}$, is given by taking the time derivative of equation 8.74

$$\ddot{y}=g+\frac{g(2Ly-y^2)}{2(L-y)^2} \tag{8.75}$$

The rate of change in linear momentum for the moving right side of the chain, $\dot{p}_R$, is given by

$$\dot{p}_R=m_R\ddot{y}+\dot{m}_R\dot{y}=m_Rg+m_Rg\frac{(2Ly-y^2)}{2(L-y)^2}+\dot{m}_R\dot{y} \tag{8.76}$$

where $m_R=\frac{\mu}{2}(L-y)$ is the mass of the moving right-hand arm.
For this energy-conserving chain, the tension in the chain $T_0$ at the fixed end of the chain is given by

$$T_0=\frac{\mu g}{2}(L+y)+\frac{1}{4}\mu\dot{y}^2 \tag{8.77}$$
Equations 8.74 and 8.76 imply that the tension $T_0$ diverges to infinity when $y\to L$. Calkin and March
measured the $y$ dependence of the chain tension at the support for the folded chain and observed the predicted
$y$ dependence. The maximum tension was $\approx 25Mg$, which is consistent with that predicted using equation 8.77
after taking into account the finite size and mass of individual links in the chain. This result is very different
from that obtained using the erroneous assumption that the right arm falls with the free-fall acceleration $g$,
which implies a maximum tension $T_0=2Mg$. Thus the free-fall assumption disagrees with the experimental
results, in addition to violating energy conservation and the tenets of Lagrangian and Hamiltonian mechanics.
That is, the experimental result demonstrates unambiguously that the energy conservation predictions apply
in contradiction with the erroneous free-fall assumption.

The unusual feature of variable mass problems, such as the folded chain problem, is that the rate of change
of momentum in equation 8.76 includes two contributions to the force and rate of change of momentum,
that is, it includes both the acceleration term $m_R\ddot{y}$ plus the variable mass term $\dot{m}_R\dot{y}$ that accounts for the
transfer of matter at the intersection of the moving and stationary partitions of the chain. At the transition
point of the chain, moving links are transferred from the moving section and are added to the stationary
subsection. Since this moving section is falling downwards, and the stationary section is stationary, then the
transferred momentum is in a downward direction corresponding to an increased effective downward force.
Thus the measured acceleration of the moving arm actually is faster than $g$. A related phenomenon is the
loud cracking sound heard when cracking a whip.
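
The behaviour described above can be tabulated directly from the energy-conservation results. The following is a rough sketch, assuming that equations 8.74, 8.75 and 8.77 read $\dot{y}^2=g\frac{2Ly-y^2}{L-y}$, $\ddot{y}=g+\frac{g(2Ly-y^2)}{2(L-y)^2}$ and $T_0=\frac{\mu g}{2}(L+y)+\frac{1}{4}\mu\dot{y}^2$; the function name and the default values of $L$, $M$ and $g$ are illustrative assumptions.

```python
def folded_chain(y, L=1.0, M=1.0, g=9.81):
    """Return (ydot squared, yddot, T0) for the folded chain, valid for 0 <= y < L."""
    mu = M / L                                    # mass per unit length
    ydot_sq = g * (2 * L * y - y**2) / (L - y)    # speed squared of the falling arm
    yddot = g + g * (2 * L * y - y**2) / (2 * (L - y)**2)  # acceleration of the free end
    tension = 0.5 * mu * g * (L + y) + 0.25 * mu * ydot_sq  # tension at the fixed support
    return ydot_sq, yddot, tension

def tension_in_units_of_weight(y, L=1.0, M=1.0, g=9.81):
    """Support tension divided by the total chain weight M*g."""
    return folded_chain(y, L, M, g)[2] / (M * g)
```

Evaluating `folded_chain` for increasing $y$ shows the acceleration starting at $g$ and rising above it, while `tension_in_units_of_weight` grows without bound as $y$ approaches $L$ in this idealized model, well beyond the value $2Mg$ implied by the free-fall assumption.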


8.11 Example: Falling chain 


The “falling chain” scenario assumes that one end of the chain is hanging
down through a hole in a frictionless, smooth, rigid, horizontal table,
with the stationary partition of the chain lying on the frictionless table
surrounding the hole. The falling section of this chain is being pulled out of
the stationary pile by the hanging partition. The analysis for the problem of
the falling chain behaves differently from the folded chain. For the “falling
chain” let $y$ be the falling distance of the lower end of the chain measured
with respect to the table top. The Lagrangian and Hamiltonian are given by

$$L(y,\dot{y})=\frac{\mu}{2}y\dot{y}^2+\mu g\frac{y^2}{2} \tag{8.78}$$

$$p_y=\frac{\partial L}{\partial\dot{y}}=\mu y\dot{y} \tag{8.79}$$

$$H=\frac{p_y^2}{2\mu y}-\mu g\frac{y^2}{2} \tag{8.80}$$

The Lagrangian and Hamiltonian are not explicitly time dependent, and the Hamiltonian equals the initial
total energy, $E_0$. Thus energy conservation can be used to give that

$$E=\frac{1}{2}\mu y\left(\dot{y}^2-gy\right)=E_0 \tag{8.81}$$
Lagrange's equation of motion gives

$$\dot{p}_y=m_y\ddot{y}+\dot{m}_y\dot{y}=m_yg+\frac{1}{2}\mu\dot{y}^2 \tag{8.82}$$

where $m_y=\mu y$ is the mass of the hanging partition. Since $\dot{m}_y\dot{y}=\mu\dot{y}^2$, this is equivalent to
$m_y\ddot{y}=m_yg-T_0$ with $T_0=\frac{1}{2}\mu\dot{y}^2$.

The important difference between the folded chain and falling chain is that the moving component of the
falling chain is gaining mass with time rather than losing mass. Also the tension in the chain $T_0$ reduces the
acceleration of the falling chain making it less than the free-fall value $g$. This is in contrast to that for the
folded chain system where the acceleration exceeds $g$.

The above discussion shows that Lagrangian and Hamiltonian mechanics can be applied to variable-mass systems if
both the donor and receptor degrees of freedom are included to ensure that the total mass is conserved.
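
The falling-chain equation of motion can be integrated numerically to confirm that the acceleration stays below $g$ while the energy remains constant. The sketch below is illustrative only and follows the energy-conserving treatment of this section; it assumes Lagrange's equation in the form $\frac{d}{dt}(\mu y\dot{y})=\mu gy+\frac{1}{2}\mu\dot{y}^2$, that is $\ddot{y}=g-\frac{\dot{y}^2}{2y}$, and the energy $E=\frac{1}{2}\mu y(\dot{y}^2-gy)$. Function names, default values, and the Runge-Kutta scheme are assumptions.

```python
def falling_chain_accel(y, y_dot, g=9.81):
    """Acceleration of the hanging partition; requires y > 0."""
    return g - y_dot**2 / (2 * y)

def falling_chain_energy(y, y_dot, mu=1.0, g=9.81):
    """Total energy E = (1/2) mu y (ydot^2 - g y) of the hanging partition."""
    return 0.5 * mu * y * (y_dot**2 - g * y)

def integrate_falling_chain(y0, y_dot0=0.0, dt=1e-4, steps=5000):
    """Fourth-order Runge-Kutta integration starting with hanging length y0 > 0."""
    y, v = y0, y_dot0
    for _ in range(steps):
        k1y, k1v = v, falling_chain_accel(y, v)
        k2y, k2v = v + 0.5 * dt * k1v, falling_chain_accel(y + 0.5 * dt * k1y, v + 0.5 * dt * k1v)
        k3y, k3v = v + 0.5 * dt * k2v, falling_chain_accel(y + 0.5 * dt * k2y, v + 0.5 * dt * k2v)
        k4y, k4v = v + dt * k3v, falling_chain_accel(y + dt * k3y, v + dt * k3v)
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y, v
```

Starting from rest with a short hanging length, the acceleration begins at $g$ and then falls toward $g/2$ as the hanging length grows, which is the limit implied by the energy relation when $E_0=0$.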


8.8 Summary 
Hamilton’s equations of motion 


Inserting the generalized momentum into Jacobi's generalized energy relation was used to define the
Hamiltonian function to be

$$H(\mathbf{q},\mathbf{p},t)=\mathbf{p}\cdot\dot{\mathbf{q}}-L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.3}$$

The Legendre transform of the Lagrange-Euler equations led to Hamilton's equations of motion.

$$\dot{q}_j=\frac{\partial H}{\partial p_j} \tag{8.25}$$

$$\dot{p}_j=-\frac{\partial H}{\partial q_j}+\left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}+Q_j^{EXC}\right] \tag{8.26}$$

The generalized energy equation 7.38 gives the time dependence

$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt}=\sum_j\left(\left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}+Q_j^{EXC}\right]\dot{q}_j\right)-\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{8.27}$$

where

$$\frac{\partial H}{\partial t}=-\frac{\partial L}{\partial t}$$

The $p_k,q_k$ are treated as independent canonical variables. Lagrange was the first to derive the canonical
equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical
equations of motion from his fundamental variational principle and made them the basis for a far-reaching
theory of dynamics. Hamilton's equations give $2s$ first-order differential equations for $p_k,q_k$ for each of the
$s$ degrees of freedom. Lagrange's equations give $s$ second-order differential equations for the variables $q_k,\dot{q}_k$.
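
The first-order structure of Hamilton's equations can be made concrete with a small numerical sketch. The code below is a hypothetical illustration, not a method from this book; it assumes no Lagrange multipliers or generalized forces, and it estimates $\dot{q}_j=\partial H/\partial p_j$ and $\dot{p}_j=-\partial H/\partial q_j$ by central differences for any Hamiltonian supplied as a Python function of two lists. The names `hamilton_rhs` and `oscillator`, and the oscillator parameters, are assumptions chosen for illustration.

```python
def hamilton_rhs(H, q, p, h=1e-6):
    """Central-difference estimate of (q_dot, p_dot) = (dH/dp, -dH/dq).

    H is a function H(q, p) of two lists of equal length.
    """
    q_dot, p_dot = [], []
    for j in range(len(q)):
        p_plus = p[:j] + [p[j] + h] + p[j + 1:]
        p_minus = p[:j] + [p[j] - h] + p[j + 1:]
        q_plus = q[:j] + [q[j] + h] + q[j + 1:]
        q_minus = q[:j] + [q[j] - h] + q[j + 1:]
        q_dot.append((H(q, p_plus) - H(q, p_minus)) / (2 * h))
        p_dot.append(-(H(q_plus, p) - H(q_minus, p)) / (2 * h))
    return q_dot, p_dot

def oscillator(q, p, m=2.0, k=3.0):
    """Hamiltonian of a one-dimensional harmonic oscillator (illustrative example)."""
    return p[0]**2 / (2 * m) + 0.5 * k * q[0]**2
```

Calling `hamilton_rhs(oscillator, [0.5], [1.0])` returns the two first-order rates for the single degree of freedom, which can then be passed to any standard first-order integrator.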

Routhian reduction technique 

The Routhian reduction technique is a hybrid of Lagrangian and Hamiltonian mechanics that exploits
the advantages of both approaches for solving problems involving cyclic variables. It is especially useful for
solving motion in rotating systems in science and engineering. Two Routhians are used frequently for solving
the equations of motion of rotating systems. Assuming that the variables with $1\le i\le s$ are non-cyclic,
while the $m$ variables with $s+1\le i\le n$ are ignorable cyclic coordinates, then the two Routhians are:

$$R_{cyclic}(q_1,\dots,q_n;\dot{q}_1,\dots,\dot{q}_s;p_{s+1},\dots,p_n;t)=\sum_{cyclic}^{m}p_i\dot{q}_i-L=H-\sum_{noncyclic}^{s}p_i\dot{q}_i \tag{8.65}$$

$$R_{noncyclic}(q_1,\dots,q_n;p_1,\dots,p_s;\dot{q}_{s+1},\dots,\dot{q}_n;t)=\sum_{noncyclic}^{s}p_i\dot{q}_i-L=H-\sum_{cyclic}^{m}p_i\dot{q}_i \tag{8.68}$$

The Routhian $R_{cyclic}$ is a negative Lagrangian for the non-cyclic variables with $1\le i\le s$, where
$s=n-m$, and is a Hamiltonian for the $m$ cyclic variables with $s+1\le i\le n$. Since the cyclic
variables are constants of the Hamiltonian, their solution is trivial, and the number of variables included in
the Lagrangian is reduced from $n$ to $s=n-m$. The Routhian $R_{cyclic}$ is useful for solving some problems in
classical mechanics. The Routhian $R_{noncyclic}$ is a Hamiltonian for the non-cyclic variables with $1\le i\le s$,
and is a negative Lagrangian for the $m$ cyclic variables with $s+1\le i\le n$. Since the cyclic variables
are constants of motion, the Routhian $R_{noncyclic}$ also is a constant of motion but it does not equal the total
energy since the coordinate transformation is time dependent. The Routhian $R_{noncyclic}$ is especially valuable
for solving rotating many-body systems such as galaxies, molecules, or nuclei, since the Routhian $R_{noncyclic}$
is the Hamiltonian in the rotating body-fixed coordinate frame.

Variable mass systems: 

Two examples of heavy flexible chains falling in a uniform gravitational field were used to illustrate 
how variable mass systems can be handled using Lagrangian and Hamiltonian mechanics. The falling-mass 
system is conservative assuming that both the donor plus the receptor body systems are included. 

Comparison of Lagrangian and Hamiltonian mechanics 

Lagrangian and Hamiltonian dynamics are two powerful and related variational algebraic formulations
of mechanics that are based on Hamilton's action principle. They can be applied to any conservative degrees
of freedom as discussed in chapters 6, 8, and 15. Lagrangian and Hamiltonian mechanics both concentrate
solely on active forces and can ignore internal forces. They can handle many-body systems and allow 
convenient generalized coordinates of choice. This ability is impractical or impossible using Newtonian 
mechanics. Thus it is natural to compare the relative advantages of these two algebraic formalisms in order 
to decide which should be used for a specific problem. 

For a system with $n$ generalized coordinates, plus $m$ constraint forces that are not required to be known,
then the Lagrangian approach, using a minimal set of generalized coordinates, reduces to only $s=n-m$
second-order differential equations and unknowns compared to the Newtonian approach where there are
$n+m$ unknowns. Alternatively, use of Lagrange multipliers allows determination of the constraint forces
resulting in $n+m$ second-order equations and unknowns. The Lagrangian potential function is limited
to conservative forces, Lagrange multipliers can be used to handle holonomic forces of constraint, while
generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the
Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative,
and they directly determine $q,\dot{q}$ rather than $q,p$, which then requires relating $p$ to $\dot{q}$.

For a system with n generalized coordinates, the Hamiltonian approach determines 2n first-order differ- 
ential equations which are easier to solve than second-order equations. However, the 2n solutions must be 
combined to determine the equations of motion. The Hamiltonian approach is superior to the Lagrange ap- 
proach in its ability to obtain an analytical solution of the integrals of the motion. Hamiltonian dynamics also 
has a means of determining the unknown variables for which the solution assumes a soluble form. Important 
applications of Hamiltonian mechanics are to quantum mechanics and statistical mechanics, where quantum 
analogs of q; and p;, can be used to relate to the fundamental variables of Hamiltonian mechanics. This 
does not apply for the variables q; and q; of Lagrangian mechanics. The Hamiltonian approach is especially 
powerful when the system has m cyclic variables, then the m conjugate momenta p; are constants. Thus the 
m conjugate variables (q;,p;) can be factored out of the Hamiltonian, which reduces the number of conjugate 
variables required to n — m. This is not possible using the Lagrangian approach since, even though the m 
coordinates q; can be factored out, the velocities q; still must be included, thus the n conjugate variables 
must be included. The Lagrange approach is advantageous for obtaining a numerical solution of systems in 
classical mechanics. However, Hamiltonian mechanics expresses the variables in terms of the fundamental 
canonical variables (q, p) which provides a more fundamental insight into the underlying physics.² 
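
As a brief illustration of the reduction by cyclic variables described above, consider the following sketch. It is an assumed textbook-style example, not one worked in this section: a particle of mass $m$ moving in a plane under an unspecified central potential $U(r)$, described by polar coordinates $(r, \varphi)$. The symbol $l$ for the conserved angular momentum is introduced here only for the illustration.

```latex
% Hamiltonian in polar coordinates; \varphi does not appear, so it is cyclic
H(r, p_r, p_\varphi) = \frac{p_r^2}{2m} + \frac{p_\varphi^2}{2 m r^2} + U(r)

% Hamilton's equation for the cyclic coordinate
\dot{p}_\varphi = -\frac{\partial H}{\partial \varphi} = 0
\quad\Longrightarrow\quad p_\varphi = l = \text{constant}

% The pair (\varphi, p_\varphi) factors out, leaving one degree of freedom
H_{\mathrm{reduced}}(r, p_r) = \frac{p_r^2}{2m} + \frac{l^2}{2 m r^2} + U(r)

% By contrast the Lagrangian still contains the velocity \dot{\varphi}
L = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\varphi}^2 \right) - U(r)
```

In this sketch the Hamiltonian treatment is left with the single conjugate pair $(r, p_r)$, whereas the Lagrangian retains $\dot{\varphi}$ even though $\varphi$ itself is absent.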


²Recommended reading: "Classical Mechanics" H. Goldstein, Addison-Wesley, Reading (1950). The present chapter 
closely follows the notation used by Goldstein to facilitate cross-referencing and reading the many other textbooks that have 
adopted this notation. 


Workshop exercises 


1. A block of mass m rests on an inclined plane making an angle θ with the horizontal. The inclined plane (a 
triangular block of mass M) is free to slide horizontally without friction. The block of mass m is also free to 
slide on the larger block of mass M without friction. 

(a) Construct the Lagrangian function. 
(b) Derive the equations of motion for this system. 
(c) Calculate the canonical momenta. 
(d) Construct the Hamiltonian function. 
(e) Find which of the two momenta found in part (c) is a constant of motion and discuss why it is so. If the 
two blocks start from rest, what is the value of this constant of motion? 


2. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several 
examples of systems exhibiting each of the four conditions. 

(a) The Hamiltonian is conserved and equals the total mechanical energy. 
(b) The Hamiltonian is conserved but does not equal the total mechanical energy. 
(c) The Hamiltonian is not conserved but does equal the total mechanical energy. 
(d) The Hamiltonian is not conserved and does not equal the total mechanical energy. 


3. Compare the Lagrangian formalism and the Hamiltonian formalism by creating a two-column chart. Label one 
side “Lagrangian” and the other side “Hamiltonian” and discuss the similarities and differences. Here are some 
ideas to get you started: 

• What are the basic variables in each formalism? 
• What are the form and number of the equations of motion derived in each case? 
• How does the Lagrangian “state space” compare to the Hamiltonian “phase space”? 


4. It can be shown that if $L(q, \dot{q}, t)$ is the Lagrangian of a particle moving in one dimension, then $L'$ yields the 
same equations of motion as $L$, where $L'(q, \dot{q}, t) = L(q, \dot{q}, t) + \frac{df}{dt}$ and $f(q,t)$ is an arbitrary function. This 
problem explores the consequences of this on the Hamiltonian formalism. 

(a) Relate the new canonical momentum p’, for L’, to the old canonical momentum p, for L. 
(b) Express the new Hamiltonian H’(q’,p’,t) for L’ in terms of the old Hamiltonian H(q, p, t) and f. 
(c) Explicitly show that the new Hamilton’s equations for H’ are equivalent to the old Hamilton’s equations 
for H. 


5. A massless hoop of radius R is rotating about an axis perpendicular to its central axis at constant angular 
velocity ω. A mass m can freely slide around the hoop. 

(a) Determine the Lagrangian of the system. 
(b) Determine the Hamiltonian of the system. Does it equal the total mechanical energy? 
(c) Determine the Lagrangian of the system with respect to a coordinate frame in which $H = T + V_{eff}$. What 
is $V_{eff}$? What force generates the additional term in $V_{eff}$? 


6. Consider a pendulum of length L attached to the end of a rod of length R. The rod is rotating at constant 
angular velocity ω in the plane. Assume the pendulum is always taut. 

(a) Determine equations of motion. 
(b) For what value of $\omega^2 R$ is this system the same as a plane pendulum in a constant gravitational field? 
(c) Show $H \neq E$. What is the reason? 


Problems 


1. A particle of mass M in a gravitational field slides on the inside of a smooth parabola of revolution whose axis is 
vertical. Using the distance from the axis r, and the azimuthal angle φ as generalized coordinates, find the following. 

a) The Lagrangian of the system. 

b) The generalized momenta and the corresponding Hamiltonian 

c) The equation of motion for the coordinate r as a function of time. 

d) If $\dot{\varphi} = 0$, show that the particle can execute small oscillations about the lowest point of the paraboloid and 
find the frequency of these oscillations. 


2. Consider a particle of mass m which is constrained to move on the surface of a sphere of radius R. There are no 
external forces of any kind acting on the particle. 

a) What is the number of generalized coordinates necessary to describe the problem? 

b) Choose a set of generalized coordinates and write the Lagrangian of the system. 

c) What is the Hamiltonian of the system? Is it conserved? 

d) Prove that the motion of the particle is along a great circle of the sphere. 


3. A block of mass m is attached to a wedge of mass M by a spring with spring constant k. The inclined frictionless 
surface of the wedge makes an angle α to the horizontal. The wedge is free to slide on a horizontal frictionless surface 
as shown in the figure. 

a) Given that the relaxed length of the spring is d, find the value $s_0$ of the spring length when both block and wedge are stationary. 

b) Find the Lagrangian for the system as a function of the x coordinate of the wedge and the length of spring s. 
Write down the equations of motion. 

c) What is the natural frequency of vibration? 


4. A fly-ball governor comprises two masses m connected by 4 hinged arms of length l to a vertical shaft and to a 
mass M which can slide up or down the shaft without friction in a uniform vertical gravitational field as shown in 
the figure. The assembly is constrained to rotate around the axis of the vertical shaft with the same angular velocity as 
that of the vertical shaft. Neglect the mass of the arms, air friction, and assume that the mass M has a negligible 
moment of inertia. Assume that the whole system is constrained to rotate with a constant angular velocity $\omega_0$. 

a) Choose suitable coordinates and use the Lagrangian to derive equations of motion of the system around the 
equilibrium position. 

b) Determine the height z of the mass M above its lowest position as a function of $\omega_0$. 

c) Find the frequency of small oscillations about this steady motion. 

d) Derive a Routhian that provides the Hamiltonian in the rotating system. 

e) Is the total energy of the fly-ball governor in the rotating frame of reference constant in time? 

f) Suppose that the shaft and assembly are not constrained to rotate at a constant angular velocity $\omega_0$, that is, 
it is allowed to rotate freely at angular velocity $\dot{\varphi}$. What is the difference in the overall motion? 


5. A rigid, straight, frictionless, massless rod rotates about the z axis at an angular velocity $\dot{\theta}$. A mass m slides 
along the frictionless rod and is attached to the rod by a massless spring of spring constant K. 

a) Derive the Lagrangian and the Hamiltonian. 

b) Derive the equations of motion in the stationary frame using Hamiltonian mechanics. 

c) What are the constants of motion? 

d) If the rotation is constrained to have a constant angular velocity $\dot{\theta} = \omega$, then is the non-cyclic Routhian 
$R_{noncyclic} = H - p_\theta \dot{\theta}$ a constant of motion, and does it equal the total energy? 

e) Use the non-cyclic Routhian $R_{noncyclic}$ to derive the radial equation of motion in the rotating frame of reference 
for the cranked system with $\dot{\theta} = \omega$. 




6. A thin uniform rod of length 2L and mass M is suspended from a massless string of length l tied to a nail. Initially 
the rod hangs vertically. A weak horizontal force F is applied to the rod’s free end. 

a) Write the Lagrangian for this system. 

b) For very short times such that all angles are small, determine the angles that string and the rod make with 
the vertical. Start from rest at t = 0. 

c) Draw a diagram to illustrate the initial motion of the rod. 


7. A uniform ladder of mass M and length 2L is leaning against a frictionless vertical wall with its feet on a 
frictionless horizontal floor. Initially the stationary ladder is released at an angle $\theta_0 = 60°$ to the floor. Assume 
that the gravitational field g = 9.81 m/s² acts vertically downward and that the moment of inertia of the ladder about its 
midpoint is $I = \frac{1}{3}ML^2$. 

a) Derive the Lagrangian 

b) Derive the Hamiltonian 

c) Explain if the Hamiltonian is conserved and/or if it equals the total energy 

d) Use the Lagrangian to derive the equations of motion 

e) Derive the angle θ at which the ladder loses contact with the vertical wall. 


8. The classical mechanics exam induces Jacob to try his hand at bungee jumping. Assume Jacob's mass m 
is suspended in a gravitational field by the bungee of unstretched length b and spring constant k. Besides the 
longitudinal oscillations due to the bungee jump, Jacob also swings with plane pendulum motion in a vertical plane. 
Use polar coordinates r, θ, neglect air drag, and assume that the bungee always is under tension. 

a) Derive the Lagrangian. 

b) Determine Lagrange's equation of motion for angular motion and identify by name the forces contributing to 
the angular motion. 

c) Determine Lagrange's equation of motion for radial oscillation and identify by name the forces contributing to 
the tension in the spring. 

d) Derive the generalized momenta. 

e) Determine the Hamiltonian and give all of Hamilton's equations of motion. 


Chapter 9 


Hamilton’s Action Principle 


9.1 Introduction 


Hamilton’s principle of stationary action was introduced in two papers published by Hamilton in 1834 and 
1835. As mentioned in the Prologue, Hamilton’s Action Principle is the foundation of the hierarchy of three 
philosophical stages that are used in applying analytical mechanics. The first stage is to use Hamilton’s 
Action Principle to derive either the Hamiltonian or the Lagrangian for the system. The second stage is to use 
either Lagrangian mechanics, or Hamiltonian mechanics, to derive the equations of motion for the system. 
The third stage is to solve these equations of motion for the assumed initial conditions. Lagrange had 
pioneered Lagrangian mechanics in 1788 based on d’Alembert’s Principle. Hamilton’s Action Principle now 
underlies theoretical physics, and many other disciplines in mathematics and economics. In 1834 Hamilton 
was seeking a theory of optics when he developed both his Action Principle, and the field of Hamiltonian 
mechanics. 

Hamilton’s Action Principle is based on defining the action functional¹ S for n generalized coordinates, 
which are expressed by the vector $\mathbf{q}$, and their corresponding velocity vector $\dot{\mathbf{q}}$. 

$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{9.1}$$

The scalar action S is a functional of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, integrated between an initial time $t_i$ and 
final time $t_f$. In principle, higher order time derivatives of the generalized coordinates could be included, but 
most systems in classical mechanics are described adequately by including only the generalized coordinates, 
plus their velocities. The definition of the action functional allows for more general Lagrangians than the 
simple Standard Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t)$ that has been used throughout chapters 5 to 8. 
Hamilton stated that the actual trajectory of a mechanical system is that given by requiring that the action 
functional is stationary with respect to change of the variables. The action functional is stationary when the 
variational principle can be written in terms of a virtual infinitesimal displacement, δ, to be 

$$\delta S = \delta \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt = 0 \tag{9.2}$$

Typically the stationary point corresponds to a minimum of the action functional. Applying variational 
calculus to the action functional leads to the same Lagrange equations of motion for systems as the equations 
derived using d’Alembert’s Principle, if the additional generalized force terms, $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}$, 
are omitted in the corresponding equations of motion. 

The Lagrange equations provide the equations of motion, which then are solved for an assumed set of initial 
conditions. Prior to Hamilton's Action Principle, Lagrange developed Lagrangian mechanics based on 
d’Alembert’s Principle while the Newtonian equations of motion are defined in terms of Newton’s Laws of 
Motion. 
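
The statement that the physical trajectory makes the action stationary, equation 9.2, can be checked numerically. The following is a minimal sketch under stated assumptions, not a method taken from the text: a particle of assumed unit mass in uniform gravity, launched and caught at the same height, with the action approximated by a simple sum over time steps. The function names `discrete_action` and `trial_path`, and the sinusoidal perturbation that vanishes at both end points, are hypothetical choices made only for this illustration.

```python
import math


def discrete_action(path, dt, m=1.0, g=9.81):
    """Approximate S = integral of (T - U) dt for a particle in uniform gravity."""
    total = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        v = (y1 - y0) / dt                    # finite-difference velocity
        kinetic = 0.5 * m * v * v
        potential = m * g * 0.5 * (y0 + y1)   # potential at the segment midpoint
        total += (kinetic - potential) * dt
    return total


def trial_path(eps, n=200, t_total=1.0, g=9.81):
    """True parabolic path plus eps times a perturbation vanishing at both ends."""
    dt = t_total / n
    path = []
    for k in range(n + 1):
        t = k * dt
        y_true = 0.5 * g * t * (t_total - t)      # solves y'' = -g, y(0) = y(T) = 0
        eta = math.sin(math.pi * t / t_total)     # assumed variation, zero at the ends
        path.append(y_true + eps * eta)
    return path, dt


if __name__ == "__main__":
    for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
        path, dt = trial_path(eps)
        print(f"eps = {eps:+.1f}   S = {discrete_action(path, dt):.6f}")
```

Under these assumptions the printed action is smallest at eps = 0 and grows symmetrically for positive and negative eps, which is the numerical signature of a stationary point that here happens to be a minimum.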


¹The term "action functional" was named "Hamilton’s Principal Function" in older texts. The name usually is abbreviated 
to "action" in modern mechanics. 


9.2 Hamilton’s Principle of Stationary Action 


Hamilton’s crowning achievement was his use of the general form of Hamilton’s principle of stationary action S, 
equation 9.2, to derive both Lagrangian mechanics and Hamiltonian mechanics. Consider the action $S_A$ for the 
extremum path of a system in configuration space, that is, along path A for j = 1,2,...,n coordinates $q_j(t_i)$ at 
initial time $t_i$ to $q_j(t_f)$ at a final time $t_f$ as shown in figure 9.1. Then the action $S_A$ is given by 

$$S_A = \int_{t_i}^{t_f} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\,dt \tag{9.3}$$

Figure 9.1: Extremum path A, plus the neighboring path B, shown in configuration space. 

As used in chapter 5.2, a family of neighboring paths is defined by adding an infinitesimal fraction ε of a 
continuous, well-behaved neighboring function $\eta_j$, where ε = 0 for the extremum path. That is, 

$$q_j(t, \epsilon) = q_j(t, 0) + \epsilon \eta_j(t) \tag{9.4}$$

In contrast to the variational case discussed when deriving Lagrangian mechanics, the variational path used 
here does not assume that the functions $\eta_j(t)$ vanish at the end points. Assume that the neighboring path B 
has an action $S_B$ where 

$$S_B = \int_{t_i + \Delta t}^{t_f + \Delta t} L(\mathbf{q}(t) + \delta\mathbf{q}(t), \dot{\mathbf{q}}(t) + \delta\dot{\mathbf{q}}(t))\,dt \tag{9.5}$$

Expanding the integrand of $S_B$ in equation 9.5 gives that, relative to the extremum path A, the incremental 
change in action is 

$$\delta S = S_B - S_A = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j}\delta q_j + \frac{\partial L}{\partial \dot{q}_j}\delta \dot{q}_j \right) dt + \left[ L \Delta t \right]_{t_i}^{t_f} \tag{9.6}$$

The second term in the integral can be integrated by parts since $\delta \dot{q}_j = \frac{d(\delta q_j)}{dt}$, leading to 

$$\delta S = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} \right) \delta q_j\,dt + \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\delta q_j + L \Delta t \right]_{t_i}^{t_f} \tag{9.7}$$


Note that equation 9.7 includes contributions from the entire path of the integral as well as the variations 
at the ends of the curve and the At terms. Equation 9.7 leads to the following two pioneering principles of 
least action in variational mechanics that were developed by Hamilton. 


9.2.1 Stationary-action principle in Lagrangian mechanics 


Derivation of Lagrangian mechanics in chapter 6 was based on the extremum path for neighboring paths 
between two given locations $\mathbf{q}(t_i)$ and $\mathbf{q}(t_f)$ that the system occupies at the initial and final times $t_i$ and $t_f$ 
respectively. For the special case where the end points do not vary, that is, when $\delta q_j(t_i) = \delta q_j(t_f) = 0$ and 
$\Delta t_i = \Delta t_f = 0$, then the variation δS of equation 9.7 for the stationary path reduces to 

$$\delta S = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} \right) \delta q_j\,dt = 0 \tag{9.8}$$

For independent variations $\delta q_j$ of the generalized coordinates, the integrand in brackets vanishes, leading to the 
Euler-Lagrange equations. Conversely, if the Euler-Lagrange equations in 9.8 are satisfied, then δS = 0, that is, the path 
is stationary. This leads to the statement that the path in configuration space between two configurations 
$\mathbf{q}(t_i)$ and $\mathbf{q}(t_f)$ that the system occupies at times $t_i$ and $t_f$ respectively, is that for which the action S is 
stationary. This is a statement of Hamilton’s Principle. 


9.2.2 Stationary-action principle in Hamiltonian mechanics 


Hamilton used the general variation of the least-action path to derive the basic equations of Hamiltonian 
mechanics. For the general path, the integral term in equation 9.7 vanishes because the Euler-Lagrange 
equations are obeyed for the stationary path. Thus the only remaining non-zero contributions are due to 
the end point terms, which can be written by defining the total variation of each end point to be 

$$\Delta q_j = \delta q_j + \dot{q}_j \Delta t \tag{9.9}$$

where $\delta q_j$ and $\dot{q}_j$ are evaluated at $t_i$ and $t_f$. Then equation 9.7 reduces to 

$$\delta S = \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\delta q_j + L \Delta t \right]_{t_i}^{t_f} = \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\Delta q_j + \left( -\sum_j \frac{\partial L}{\partial \dot{q}_j}\dot{q}_j + L \right) \Delta t \right]_{t_i}^{t_f} \tag{9.10}$$

Since the generalized momentum $p_j = \frac{\partial L}{\partial \dot{q}_j}$, then equation 9.10 can be expressed in terms of the Hamiltonian 
and generalized momentum as 

$$\delta S = \left[ \sum_j p_j \Delta q_j - H \Delta t \right]_{t_i}^{t_f} = \left[ \mathbf{p} \cdot \Delta\mathbf{q} - H \Delta t \right]_{t_i}^{t_f} \tag{9.11}$$

$$\frac{\partial S}{\partial q_j} = \frac{\partial L}{\partial \dot{q}_j} = p_j \tag{9.12}$$

Equation 9.11 contains Hamilton’s Principle of Least-action. Equation 9.12 gives an alternative relation of 
the generalized momentum $p_j$ that is expressed in terms of the action functional S. Note that equations 
9.11 and 9.12 were derived directly without invoking reference to the Lagrangian. 

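Integrating the action δS, equation 9.10, between the end points gives the action for the path between 
$t = t_i$ and $t = t_f$, that is, $S(q_j(t_i), t_i, q_j(t_f), t_f)$ to be 

$$S(q_j(t_i), t_i, q_j(t_f), t_f) = \int_i^f \left[ \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] dt \tag{9.13}$$

The stationary path is obtained by using the variational principle 

$$\delta S = \delta \int_i^f \left[ \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] dt = 0 \tag{9.14}$$

The integrand, $I = \left[ \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right]$, in this modified Hamilton’s principle, can be used in the n 
Euler-Lagrange equations for j = 1,2,3,...,n to give 

$$\frac{d}{dt}\left( \frac{\partial I}{\partial \dot{q}_j} \right) - \frac{\partial I}{\partial q_j} = \dot{p}_j + \frac{\partial H}{\partial q_j} = 0 \tag{9.15}$$

Similarly, the other n Euler-Lagrange equations give 

$$\frac{d}{dt}\left( \frac{\partial I}{\partial \dot{p}_j} \right) - \frac{\partial I}{\partial p_j} = -\dot{q}_j + \frac{\partial H}{\partial p_j} = 0 \tag{9.16}$$

Thus Hamilton’s principle of least-action leads to Hamilton’s equations of motion, that is, equations 9.15 
and 9.16. 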
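
Hamilton’s equations 9.15 and 9.16 are first-order equations that are straightforward to integrate numerically. The sketch below is an illustration under stated assumptions rather than anything prescribed by the text: it takes the one-dimensional harmonic oscillator $H = p^2/2m + kq^2/2$ with assumed unit mass and spring constant, and advances $(q, p)$ with a standard leapfrog (kick-drift-kick) step. The helper names `dH_dq`, `dH_dp`, and `integrate` are hypothetical.

```python
import math

M = 1.0  # assumed mass
K = 1.0  # assumed spring constant


def hamiltonian(q, p):
    """H(q, p) = p^2 / (2 M) + K q^2 / 2 for the harmonic oscillator."""
    return p * p / (2.0 * M) + 0.5 * K * q * q


def dH_dq(q, p):
    return K * q


def dH_dp(q, p):
    return p / M


def integrate(q, p, dt, steps):
    """Leapfrog integration of dq/dt = dH/dp and dp/dt = -dH/dq."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)   # half step in momentum
        q += dt * dH_dp(q, p)         # full step in coordinate
        p -= 0.5 * dt * dH_dq(q, p)   # second half step in momentum
    return q, p


if __name__ == "__main__":
    period = 2.0 * math.pi * math.sqrt(M / K)
    steps = 10000
    q, p = integrate(1.0, 0.0, period / steps, steps)
    print("after one period: q =", q, " p =", p, " H =", hamiltonian(q, p))
```

With these assumed parameters the system returns close to its starting point $(q, p) = (1, 0)$ after one period, and the Hamiltonian stays close to its initial value of 0.5, as expected for a time-independent Hamiltonian.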
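The total time derivative of the action S, which is a function of the coordinates and time, is 

$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j \frac{\partial S}{\partial q_j}\dot{q}_j = \frac{\partial S}{\partial t} + \mathbf{p} \cdot \dot{\mathbf{q}} \tag{9.17}$$

But the total time derivative of equation 9.14 equals 

$$\frac{dS}{dt} = \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \tag{9.18}$$

Combining equations 9.17 and 9.18 gives the Hamilton-Jacobi equation which is discussed in chapter 15.4. 

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{9.19}$$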
In summary, Hamilton’s principle of least action leads directly to Hamilton’s equations of motion (9.15, 9.16) 
plus the Hamilton-Jacobi equation (9.19). Note that the above discussion has derived both Hamilton’s Ac- 
tion Principle (9.8), and Hamilton’s equations of motion (9.15, 9.16), directly from Hamilton’s variational 
concept of stationary action, S, without explicitly invoking the Lagrangian. 
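
The relations 9.12 and 9.19 can be checked on the simplest possible case. The following worked sketch assumes a free particle of mass $m$ in one dimension that starts at an assumed position $q_0$ at time $t_0$; this example and the symbols $q_0$, $t_0$ are introduced here for illustration and are not taken from the text.

```latex
% Free particle: L = \tfrac{1}{2} m \dot{q}^2 with constant velocity (q - q_0)/(t - t_0)
S(q, t) = \int_{t_0}^{t} \tfrac{1}{2} m \dot{q}^2 \, dt'
        = \frac{m (q - q_0)^2}{2 (t - t_0)}

% Equation 9.12: the derivative with respect to the end point gives the momentum
\frac{\partial S}{\partial q} = \frac{m (q - q_0)}{t - t_0} = m \dot{q} = p

% Equation 9.19: the derivative with respect to the end time gives minus the Hamiltonian
\frac{\partial S}{\partial t} = -\frac{m (q - q_0)^2}{2 (t - t_0)^2}
                              = -\frac{p^2}{2m} = -H
\quad\Longrightarrow\quad \frac{\partial S}{\partial t} + H = 0
```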


9.2.3 Abbreviated action 


Hamilton’s Action Principle determines completely the path of the motion and the position on the path as 
a function of time. If the Lagrangian and the Hamiltonian are time independent, that is, conservative, then 
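H = E and equation 9.13 equals 

$$S(q_j(t_i), t_i, q_j(t_f), t_f) = \int_i^f \left[ \mathbf{p} \cdot \dot{\mathbf{q}} - E \right] dt = \int_i^f \mathbf{p} \cdot \delta\mathbf{q} - E(t_f - t_i) \tag{9.20}$$

The $\int_i^f \mathbf{p} \cdot \delta\mathbf{q}$ term in equation 9.20 is called the abbreviated action, which is defined as 

$$S_0 \equiv \int_i^f \mathbf{p} \cdot \dot{\mathbf{q}}\,dt = \int_i^f \mathbf{p} \cdot \delta\mathbf{q} \tag{9.21}$$

The abbreviated action can be simplified assuming use of the standard Lagrangian L = T − U with a 
velocity-independent potential U. Then equation 8.4 gives 

$$S_0 \equiv \int_{t_i}^{t_f} \sum_j^n p_j \dot{q}_j\,dt = \int_{t_i}^{t_f} (L + H)\,dt = \int_{t_i}^{t_f} 2T\,dt = \int_{t_i}^{t_f} \mathbf{p} \cdot \delta\mathbf{q} \tag{9.22}$$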


Abbreviated action provides for use of a simplified form of the principle of least action that is based 
on the kinetic energy, and not potential energy. For conservative systems it determines the path of the 
motion, but not the time dependence of the motion. Consider virtual motions where the path satisfies 
energy conservation, and where the end points are held fixed, that is, $\delta q_j = 0$, but allow for a variation δt in 
the final time. Then using the Hamilton-Jacobi equation, 9.19, 

$$\delta S = -H \delta t = -E \delta t \tag{9.23}$$

However, equations 9.20 and 9.21 give that 

$$\delta S = \delta S_0 - E \delta t \tag{9.24}$$

Therefore 

$$\delta S_0 = 0 \tag{9.25}$$

That is, the abbreviated action has a minimum with respect to all paths that satisfy the conservation of 
energy, which can be written as 

$$\delta S_0 = \delta \int_{t_i}^{t_f} 2T\,dt = 0 \tag{9.26}$$


Equation 9.26 is called the Maupertuis’ least-action principle which he proposed in 1744 based on Fermat’s 
Principle in optics. Credit for the formulation of least action commonly is given to Maupertuis; however, the 
Maupertuis principle is similar to the use of least action applied to the “vis viva”, as was proposed by Leibniz 
four decades earlier. Maupertuis used teleological arguments, rather than scientific rigor, because of his 
limited mathematical capabilities. In 1744 Euler provided a scientifically rigorous argument, presented above, 
that underlies the Maupertuis principle. Euler derived the correct variational relation for the abbreviated 
action to be 

$$\delta S_0 = \delta \int_{t_i}^{t_f} \sum_j^n p_j \delta q_j = 0 \tag{9.27}$$
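
The two forms of the abbreviated action in equation 9.22 can be compared numerically. The sketch below is an illustration under assumptions that are not in the text: a one-dimensional harmonic oscillator following $q(t) = A\cos\omega t$, for which the integral of $2T$ over one full period should equal the phase-space loop integral of $p\,dq$, namely $2\pi E/\omega$. The function name and parameter values are hypothetical.

```python
import math


def abbreviated_action_one_period(amplitude, m=1.0, k=1.0, n=2000):
    """Integrate 2T dt over one period of q(t) = A cos(omega t) by the midpoint rule."""
    omega = math.sqrt(k / m)
    period = 2.0 * math.pi / omega
    dt = period / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        v = -amplitude * omega * math.sin(omega * t)   # velocity along the true path
        total += 2.0 * (0.5 * m * v * v) * dt          # integrand is 2T
    return total


if __name__ == "__main__":
    amplitude, m, k = 0.5, 2.0, 8.0          # assumed illustrative values
    omega = math.sqrt(k / m)
    energy = 0.5 * k * amplitude ** 2        # total energy E of the oscillator
    print("integral of 2T dt :", abbreviated_action_one_period(amplitude, m, k))
    print("2 pi E / omega    :", 2.0 * math.pi * energy / omega)
```

The two printed numbers agree to numerical precision for these assumed values, illustrating that the abbreviated action depends only on the kinetic energy along the path.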


Hamilton’s use of the principle of least action to derive both Lagrangian and Hamiltonian mechanics is 
a remarkable accomplishment. It underlies both Lagrangian and Hamiltonian mechanics and confirmed the 
conjecture of Maupertuis. 


9.2.4 Hamilton's Principle applied using initial boundary conditions 


Galley[Gal13] identified a subtle inconsistency in the applications of Hamilton's Principle of Stationary Action 
to both Lagrangian and Hamiltonian mechanics. The inconsistency involves the fact that Hamilton's Principle is 
defined as the action integral between the initial time $t_i$ and the final time $t_f$ as boundary conditions, that is, 
it is assumed to be time symmetric. However, most applications in Lagrangian and Hamiltonian mechanics assume 
that the action integral is evaluated based on the initial values as the boundary conditions, rather than the 
initial $t_i$ and final $t_f$ times. That is, typical applications require use of a time-asymmetric version of 
Hamilton’s principle. Galley[Gal13][Gal14] proposed a framework for transforming Hamilton’s Principle to a 
time-asymmetric form in order to handle problems where the boundary conditions are based on using only the 
initial values at the initial time $t_i$, rather than the initial plus final times $(t_i, t_f)$ that are assumed in the 
time-symmetric definition of the action in Hamilton’s Principle. 

The following describes the framework proposed by Galley for transforming Hamilton’s Principle to a 
time-asymmetric form. Let $\mathbf{q}$ and $\dot{\mathbf{q}}$ designate sets of N generalized coordinates, plus their velocities, where 
$\mathbf{q}$ and $\dot{\mathbf{q}}$ are the fundamental variables assumed in the definition of the Lagrangian used by Hamilton’s 
Principle. As illustrated schematically in figure 9.2, Galley proposed doubling the number of degrees of freedom 
for the system considered, that is, let $\mathbf{q} \rightarrow (\mathbf{q}_1, \mathbf{q}_2)$ and $\dot{\mathbf{q}} \rightarrow (\dot{\mathbf{q}}_1, \dot{\mathbf{q}}_2)$. In addition he defines two 
identical variational paths 1 and 2, where path 2 is the time reverse of path 1. That is, path 1 starts at the 
initial time $t_i$ and ends at $t_f$, whereas path 2 starts at $t_f$ and ends at $t_i$. That is, he assumes that $\mathbf{q}_1$ 
and $\mathbf{q}_2$ specify the two paths in the space of the doubled degrees of freedom that are identical, and that they 
intersect at the final time $t_f$. The arrows shown on the paths in figure 9.2 designate the assumed direction of 
the time integration along these paths. 

Figure 9.2: The left schematic shows paths between the initial $\mathbf{q}(t_i)$ and final $\mathbf{q}(t_f)$ times for conservative 
mechanics. The solid line designates the path for which the action is stationary, while the dashed lines represent 
the varied paths. The right schematic shows the paths applied to the doubled degrees of freedom with two initial 
boundary conditions, that is, $\mathbf{q}_1(t_i)$ and $\mathbf{q}_2(t_i)$, plus assuming that both paths are identical at their 
intersection and that they intersect at the same final time, that is, $\mathbf{q}_1(t_f) = \mathbf{q}_2(t_f)$. 

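For the doubled system of degrees of freedom, the total action for the sum of the two paths is given by 
the time integral of the doubled variables, $S(\mathbf{q}_1, \mathbf{q}_2)$, which can be written as 

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)\,dt + \int_{t_f}^{t_i} L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)\,dt = \int_{t_i}^{t_f} \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] dt \tag{9.28}$$

The above relation assumes that the doubled variables $(\mathbf{q}_1, \dot{\mathbf{q}}_1)$ and $(\mathbf{q}_2, \dot{\mathbf{q}}_2)$ are decoupled from each other. 
More generally one can assume that the two sets of variables are coupled by some arbitrary function 
$K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$. Then the action can be written as 

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] dt \tag{9.29}$$

The effective Lagrangian for this doubled system then can be defined as 

$$\Lambda(\mathbf{q}_1, \mathbf{q}_2, \dot{\mathbf{q}}_1, \dot{\mathbf{q}}_2, t) \equiv \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] \tag{9.30}$$

and the action can be written as 

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} \Lambda(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)\,dt \tag{9.31}$$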


The coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ for the doubled system of degrees of freedom must satisfy the 
following two properties. 

(a) If it can be expressed as the difference of two scalar potentials, $\Delta U(\mathbf{q}_1, \mathbf{q}_2) = U(\mathbf{q}_1) - U(\mathbf{q}_2)$, then 
it can be absorbed into the potential term for each of the doubled variables in the Lagrangian. This implies 
that K = 0, and there is no reason to double the number of degrees of freedom because the system is 
conservative. Thus K describes generalized forces that are not derivable from potential energy, that is, not 
conservative. 

(b) A second property of the coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ is that it must be antisymmetric under 
interchange of the arbitrary labels 1 ↔ 2. That is, 

$$K(\mathbf{q}_2, \dot{\mathbf{q}}_2, \mathbf{q}_1, \dot{\mathbf{q}}_1, t) = -K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \tag{9.32}$$

Therefore the antisymmetric function $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ vanishes when $\mathbf{q}_2 = \mathbf{q}_1$. 
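
Property (b) is easy to test numerically for any candidate coupling. The sketch below is only an illustration: the specific coupling $K = -\gamma (q_1 - q_2)(\dot{q}_1 + \dot{q}_2)/2$ for a single degree of freedom, the coefficient `gamma`, and the helper names are assumptions introduced here, not quantities defined in the text.

```python
def coupling_k(q1, v1, q2, v2, gamma=0.3):
    """Assumed illustrative coupling K = -gamma * (q1 - q2) * (v1 + v2) / 2."""
    return -gamma * (q1 - q2) * 0.5 * (v1 + v2)


def is_antisymmetric(k_func, samples, tol=1e-12):
    """Check K(q2, v2, q1, v1) = -K(q1, v1, q2, v2) on a list of sample points."""
    return all(
        abs(k_func(q2, v2, q1, v1) + k_func(q1, v1, q2, v2)) <= tol
        for (q1, v1, q2, v2) in samples
    )


if __name__ == "__main__":
    samples = [(0.1, 0.2, -0.4, 1.5), (2.0, -1.0, 0.5, 0.25), (1.0, 1.0, 1.0, 1.0)]
    print("antisymmetric under 1 <-> 2:", is_antisymmetric(coupling_k, samples))
    print("K when the two paths coincide:", coupling_k(1.0, 0.7, 1.0, 0.7))
```

For this assumed coupling the check passes, and K evaluates to zero when the two copies of the coordinate and velocity coincide, in line with equation 9.32.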
The variational condition requires that the action $S(\mathbf{q}_1, \mathbf{q}_2)$ has a well defined stationary point for the 
doubled system. This is achieved by parametrizing both coordinate paths as 

$$\mathbf{q}_{1,2}(t, \epsilon) = \mathbf{q}_{1,2}(t, 0) + \epsilon \boldsymbol{\eta}_{1,2}(t) \tag{9.33}$$

where $\mathbf{q}_{1,2}(t, 0)$ are the coordinates for which the action is stationary, $\epsilon \ll 1$, and where $\boldsymbol{\eta}_{1,2}(t)$ are arbitrary 
functions of time denoting virtual displacements of the paths. The doubled system has two independent 
paths connecting the two initial boundary conditions at $t_i$, and it requires that these paths intersect at $t_f$. 
The variational system for the two intersecting paths requires specifying four conditions, two per path. Two 
of the four conditions are determined by requiring that at $t_i$ the initial boundary conditions satisfy 
$\boldsymbol{\eta}_{1,2}(t_i) = 0$. The remaining two conditions are derived by requiring that the variation of the action $S(\mathbf{q}_1, \mathbf{q}_2)$ 
satisfies 


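$$\left[ \frac{dS}{d\epsilon} \right]_{\epsilon=0} = 0 = \int_{t_i}^{t_f} dt \left\{ \boldsymbol{\eta}_1 \cdot \left[ \frac{\partial \Lambda}{\partial \mathbf{q}_1} - \frac{d\boldsymbol{\pi}_1}{dt} \right]_{\epsilon=0} + \boldsymbol{\eta}_2 \cdot \left[ \frac{\partial \Lambda}{\partial \mathbf{q}_2} + \frac{d\boldsymbol{\pi}_2}{dt} \right]_{\epsilon=0} \right\} + \left[ \boldsymbol{\eta}_1 \cdot \boldsymbol{\pi}_1 - \boldsymbol{\eta}_2 \cdot \boldsymbol{\pi}_2 \right]_{t=t_f} \tag{9.34}$$

The canonical momenta $\boldsymbol{\pi}_{1,2}$ conjugate to the doubled coordinates $\mathbf{q}_{1,2}$ are defined using the nonconservative 
Lagrangian Λ to be 

$$\pi_1^I(\mathbf{q}_{1,2}, \dot{\mathbf{q}}_{1,2}) \equiv \frac{\partial \Lambda}{\partial \dot{q}_1^I(t)} = \frac{\partial L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)}{\partial \dot{q}_1^I(t)} + \frac{\partial K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_1^I(t)} \tag{9.35}$$

where the superscript I designates the solution based on the initial conditions. Note that the conjugate 
momentum $p_1^I = \frac{\partial L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)}{\partial \dot{q}_1^I(t)}$, while the $\frac{\partial K}{\partial \dot{q}_1^I(t)}$ term is part of the total momentum due to the 
nonconservative interaction. Similarly the momentum for the second path, where the overall sign accounts for 
the $-L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ term in Λ, is 

$$\pi_2^I(\mathbf{q}_{1,2}, \dot{\mathbf{q}}_{1,2}) \equiv -\frac{\partial \Lambda}{\partial \dot{q}_2^I(t)} = \frac{\partial L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_2^I(t)} - \frac{\partial K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_2^I(t)} \tag{9.36}$$

The last term in equation 9.34, that is, the term $\left[ \boldsymbol{\eta}_1 \cdot \boldsymbol{\pi}_1 - \boldsymbol{\eta}_2 \cdot \boldsymbol{\pi}_2 \right]_{t=t_f}$, results from integration by parts, 
which will vanish if 

$$\eta_1^I(t_f)\,\pi_1^I(t_f) = \eta_2^I(t_f)\,\pi_2^I(t_f) \tag{9.37}$$

The equality condition at the intersection of the two paths at $t_f$ requires that 

$$\eta_1^I(t_f) = \eta_2^I(t_f) \tag{9.38}$$

Therefore equations 9.37 and 9.38 imply that 

$$\pi_1^I(t_f) = \pi_2^I(t_f) \tag{9.39}$$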


Therefore equations 9.38 and 9.39 constitute the equality condition that must be satisfied when the two 
paths intersect at tf. The equality condition ensures that the boundary term for integration by parts in 
equation 9.34 will vanish for arbitrary variations provided that the two unspecified paths agree at the final 
time $t_f$. Similarly the conjugate momenta $\pi_1^I(t_f)$, $\pi_2^I(t_f)$ must agree, but otherwise are unspecified. As a 
consequence, the equality condition ensures that the variational principle is consistent with the final state at 
$t_f$ not being specified. That is, the equations of motion are only specified by the initial boundary conditions 
of the time asymmetric action for the doubled system. 

More physics insight is provided by using a more convenient parametrization of the coordinates in terms 
of their average and difference. That is, let 

$$q_+^I \equiv \frac{q_1^I + q_2^I}{2}, \qquad q_-^I \equiv q_1^I - q_2^I \tag{9.40}$$

Then the physical limit is 

$$q_+^I \rightarrow q^I, \qquad q_-^I \rightarrow 0 \tag{9.41}$$

That is, the average history is the relevant physical history, while the difference coordinate simply vanishes. 
For these coordinates, the nonconservative Lagrangian is $\Lambda(\mathbf{q}_+, \mathbf{q}_-, \dot{\mathbf{q}}_+, \dot{\mathbf{q}}_-, t)$ and the equality conditions 
reduce to 

$$\eta_-^I(t_f) = 0 \tag{9.42}$$

$$\pi_-^I(t_f) = 0 \tag{9.43}$$

which implies that the physically relevant average (+) quantities are not specified at the final time $t_f$ in 
order to have a well-defined variational principle. 
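The canonical momenta are given by 

$$\pi_+^I \equiv \frac{\pi_1^I + \pi_2^I}{2} = \frac{\partial \Lambda}{\partial \dot{q}_-^I} \tag{9.44}$$

$$\pi_-^I \equiv \pi_1^I - \pi_2^I = \frac{\partial \Lambda}{\partial \dot{q}_+^I} \tag{9.45}$$

The equations of motion can be written as 

$$\frac{d}{dt}\frac{\partial \Lambda}{\partial \dot{q}_\pm^I} = \frac{\partial \Lambda}{\partial q_\pm^I} \tag{9.46}$$

Equation 9.46 is satisfied identically for the + subscript, while, in the physical limit (PL), the negative subscript 
gives that 

$$\left[ \frac{d}{dt}\frac{\partial \Lambda}{\partial \dot{q}_-^I} - \frac{\partial \Lambda}{\partial q_-^I} \right]_{PL} = 0 \tag{9.47}$$

Substituting for the Lagrangian Λ gives that 

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^I} - \frac{\partial L}{\partial q^I} = \left[ \frac{\partial K}{\partial q_-^I} - \frac{d}{dt}\frac{\partial K}{\partial \dot{q}_-^I} \right]_{PL} \equiv Q^I(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{9.48}$$

where $Q^I$ is a generalized nonconservative force derived from K. 
Note that equation 9.46 can be derived equally well by taking the direct functional derivative with respect 
to $q_-^I(t)$, that is, 

$$0 = \left[ \frac{\delta S}{\delta q_-^I(t)} \right]_{PL} \tag{9.49}$$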


The above time-asymmetric formalism applies Hamilton’s action principle to systems that involve initial 
boundary conditions while the second path corresponds to the final boundary conditions. This framework, 
proposed recently by Galley[Gal13], provides a remarkable advance for the handling of nonconservative action 
in Lagrangian and Hamiltonian mechanics.² This formalism directly incorporates the variational principle 
for initial boundary conditions and causal dynamics that are usually required for applications of Lagrangian 
and Hamiltonian mechanics. Currently, there is limited exploitation of this new formalism because there 
has been insufficient time for it to become well known, for full recognition of its importance, and for the 
development and publication of applications. Chapter 10 discusses an application of this formalism to 
nonconservative systems in classical mechanics. 
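
To make equation 9.48 concrete, the following worked sketch applies it to a one-dimensional harmonic oscillator with linear damping. The coupling K shown below, and the damping coefficient $\gamma$, are assumptions chosen for illustration; they are not specified in this section.

```latex
% Conservative part: harmonic oscillator of mass m and spring constant k
L(q, \dot{q}) = \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2

% Assumed coupling; it changes sign under 1 <-> 2 because q_- -> -q_- while q_+ is unchanged
K = -\gamma \, q_- \, \dot{q}_+

% Doubled Lagrangian in the average and difference coordinates of equation 9.40
\Lambda = L(q_1, \dot{q}_1) - L(q_2, \dot{q}_2) + K
        = m \dot{q}_+ \dot{q}_- - k q_+ q_- - \gamma \, q_- \dot{q}_+

% Generalized nonconservative force from equation 9.48
Q = \left[ \frac{\partial K}{\partial q_-}
         - \frac{d}{dt} \frac{\partial K}{\partial \dot{q}_-} \right]_{PL}
  = \left[ -\gamma \dot{q}_+ - 0 \right]_{PL} = -\gamma \dot{q}

% Resulting equation of motion in the physical limit
m \ddot{q} + k q = -\gamma \dot{q}
```

Under these assumptions the formalism reproduces the familiar equation of motion of the linearly damped oscillator directly from a variational principle with initial boundary conditions.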


²This topic goes beyond the planned scope of this book. It is recommended that the reader refer to the work of Galley, 
Tsang, and Stein[Gal13, Gal14] for further discussion plus examples of applying this formalism to nonconservative systems in 
classical mechanics, electromagnetic radiation, RLC circuits, fluid dynamics, and field theory. 


9.3 Lagrangian 


9.3.1 Standard Lagrangian 


Lagrangian mechanics, as introduced in chapter 6, was based on the concepts of kinetic energy and potential 
energy. d’Alembert’s principle of virtual work was used to derive Lagrangian mechanics in chapter 6 and this 
led to the definition of the standard Lagrangian. That is, the standard Lagrangian was defined in chapter 
6.2 to be the difference between the kinetic and potential energies. 


Hamilton extended Lagrangian mechanics by defining Hamilton’s Principle, equation 9.2, which states that 
a dynamical system follows a path for which the action functional, that is, the time integral of the 
Lagrangian, is stationary. Chapter 6 showed that using the standard Lagrangian for defining the action 
functional leads to the Euler-Lagrange variational equations 

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{9.51}$$


The Lagrange multiplier terms handle the holonomic constraint forces and $Q_j^{EXC}$ handles the remaining 
excluded generalized forces. Chapters 6 — 8 showed that the use of the standard Lagrangian, with the Euler- 
Lagrange equations (9.51), provides a remarkably powerful and flexible way to derive second-order equations 
of motion for dynamical systems in classical mechanics. 

Note that the Euler-Lagrange equations, expressed solely in terms of the standard Lagrangian (9.51), 
that is, excluding the $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms, are valid only under the following conditions: 


1. The forces acting on the system, apart from any forces of constraint, must be derivable from scalar 
potentials. 


2. The equations of constraint must be relations that connect the coordinates of the particles and may 
be functions of time, that is, the constraints are holonomic. 


The $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms extend the range of validity of using the standard Lagrangian in the 
Lagrange-Euler equations by introducing constraint and omitted forces explicitly. 

Chapters 6— 8 exploited Lagrangian mechanics based on use of the standard definition of the Lagrangian. 
The present chapter will show that the powerful Lagrangian formulation, using the standard Lagrangian, 
can be extended to include alternative non-standard Lagrangians that may be applied to dynamical systems 
where use of the standard definition of the Lagrangian is inapplicable. If these non-standard Lagrangians 
satisfy Hamilton's Action Principle, 9.2, then they can be used with the Euler-Lagrange equations to generate 
the correct equations of motion, even though the Lagrangian may not have the simple relation to the kinetic 
and potential energies adopted by the standard Lagrangian. Currently, the development and exploitation of 
non-standard Lagrangians is an active field of Lagrangian mechanics. 


9.3.2 Gauge invariance of the standard Lagrangian 


Note that the standard Lagrangian is not unique in that there is a continuous spectrum of equivalent 
standard Lagrangians that all lead to identical equations of motion. This is because the Lagrangian L is a 
scalar quantity that is invariant with respect to coordinate transformations. The following transformations 
change the standard Lagrangian, but leave the equations of motion unchanged. 


1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels 
out when the derivatives in the Euler-Lagrange differential equations are applied. 


2. The Lagrangian is indefinite with respect to addition of a constant kinetic energy. 


3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form 
$L_2 = L_1 + \frac{d}{dt}\left[\Lambda(\mathbf{q}, t)\right]$, for any differentiable function $\Lambda(\mathbf{q}, t)$ of the generalized coordinates plus time, that 
has continuous second derivatives. 
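This last statement can be proved by considering a transformation between two related standard 
Lagrangians of the form 

$$L_2(\mathbf{q}, \dot{\mathbf{q}}, t) = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \frac{d\Lambda(\mathbf{q}, t)}{dt} = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \left(\sum_j \frac{\partial \Lambda(\mathbf{q}, t)}{\partial q_j}\dot{q}_j + \frac{\partial \Lambda(\mathbf{q}, t)}{\partial t}\right)$$
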
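This leads to a standard Lagrangian $L_2$ that has the same equations of motion as $L_1$, as is shown by 
substituting equation 9.52 into the Euler-Lagrange equations. That is, 

$$\frac{d}{dt}\left(\frac{\partial L_2}{\partial \dot{q}_j}\right) - \frac{\partial L_2}{\partial q_j} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} + \frac{d}{dt}\left(\frac{\partial \Lambda}{\partial q_j}\right) - \frac{\partial}{\partial q_j}\left(\frac{d\Lambda}{dt}\right) = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} \tag{9.52}$$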


Thus even though the related Lagrangians $L_1$ and $L_2$ are different, they are completely equivalent in that 
they generate identical equations of motion. 

There is an unlimited range of equivalent standard Lagrangians that all lead to the same equations of 
motion and satisfy the requirements of the Lagrangian. That is, there is no unique choice among the wide 
range of equivalent standard Lagrangians expressed in terms of generalized coordinates. This discussion is 
an example of gauge invariance in physics. 
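
As a minimal worked illustration of this equivalence (the oscillator and the gauge function below are chosen here for illustration and are not taken from the text), consider a one-dimensional harmonic oscillator with an assumed gauge function $\Lambda = \frac{1}{2} a q^2$, where $a$ is an arbitrary constant:

```latex
% Assumed illustration: L_1 is the standard oscillator Lagrangian,
% \Lambda(q,t) = a q^2 / 2 is a hypothetical gauge function.
\begin{align}
L_1 &= \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2 ,
\qquad
L_2 = L_1 + \frac{d}{dt}\left(\tfrac{1}{2} a q^2\right) = L_1 + a q \dot{q} \\
\frac{\partial L_2}{\partial \dot{q}} &= m \dot{q} + a q ,
\qquad
\frac{d}{dt}\frac{\partial L_2}{\partial \dot{q}} = m \ddot{q} + a \dot{q} ,
\qquad
\frac{\partial L_2}{\partial q} = -k q + a \dot{q} \\
\frac{d}{dt}\frac{\partial L_2}{\partial \dot{q}} - \frac{\partial L_2}{\partial q}
 &= m \ddot{q} + k q = 0
\end{align}
```

The $a\dot{q}$ terms cancel, so $L_2$ reproduces the equation of motion of $L_1$. Note that the canonical momentum does change, from $m\dot{q}$ to $m\dot{q} + aq$, even though the motion does not.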

Modern theories in physics describe reality in terms of potential fields. Gauge invariance, which also is 
called gauge symmetry, is a property of field theory for which different underlying fields lead to identical 
observable quantities. Well-known examples are the static electric potential field and the gravitational 
potential field where any arbitrary constant can be added to these scalar potentials with zero impact on the 
observed static electric field or the observed gravitational field. Gauge theories constrain the laws of physics 
in that the impact of gauge transformations must cancel out when expressed in terms of the observables. 
Gauge symmetry plays a crucial role in both classical and quantal manifestations of field theory, e.g. it is 
the basis of the Standard Model of electroweak and strong interactions. 

Equivalent Lagrangians are a clear manifestation of gauge invariance as illustrated by equations 9.52, 9.53 
which show that adding any total time derivative of a scalar function $\Lambda(\mathbf{q}, t)$ to the Lagrangian has no 
observable consequences on the equations of motion. That is, although addition of the total time derivative 
of the scalar function $\Lambda(\mathbf{q}, t)$ changes the value of the Lagrangian, it does not change the equations of motion 
for the observables derived using equivalent standard Lagrangians. 

For Lagrangian formulations of classical mechanics, the gauge invariance is readily apparent by direct 
inspection of the Lagrangian. 


9.1 Example: Gauge invariance in electromagnetism 


The scalar electric potential $\Phi$ and the vector potential $\mathbf{A}$ fields in electromagnetism are examples of 
gauge-invariant fields. These electromagnetic-potential fields are not directly observable, that is, the electromagnetic 
observable quantities are the electric field $\mathbf{E}$ and magnetic field $\mathbf{B}$, which can be derived from the scalar and 
vector potential fields $\Phi$ and $\mathbf{A}$. An advantage of using the potential fields is that they reduce the problem 
from 6 components, 3 each for $\mathbf{E}$ and $\mathbf{B}$, to 4 components, one for the scalar field $\Phi$ and 3 for the vector 
potential $\mathbf{A}$. The Lagrangian for the velocity-dependent Lorentz force, given by equation 6.67, provides an 
example of gauge invariance. Equations 6.63 and 6.65 showed that the electric and magnetic fields can be 
expressed in terms of scalar and vector potentials $\Phi$ and $\mathbf{A}$ by the relations 

$$\mathbf{B} = \nabla \times \mathbf{A}$$

$$\mathbf{E} = -\nabla \Phi - \frac{\partial \mathbf{A}}{\partial t}$$

The equations of motion for a charge $q$ in an electromagnetic field can be obtained by using the Lagrangian 

$$L = \frac{1}{2} m \mathbf{v} \cdot \mathbf{v} - q\left(\Phi - \mathbf{A} \cdot \mathbf{v}\right)$$


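Consider the transformations $(\mathbf{A}, \Phi) \rightarrow (\mathbf{A}', \Phi')$ in the transformed Lagrangian $L'$, where 

$$\mathbf{A}' = \mathbf{A} + \nabla \Lambda(\mathbf{r}, t)$$

$$\Phi' = \Phi - \frac{\partial \Lambda(\mathbf{r}, t)}{\partial t}$$

The transformed Lorentz-force Lagrangian $L'$ is related to the original Lorentz-force Lagrangian $L$ by 

$$L' = L + q\left[\dot{\mathbf{r}} \cdot \nabla \Lambda(\mathbf{r}, t) + \frac{\partial \Lambda(\mathbf{r}, t)}{\partial t}\right] = L + q \frac{d}{dt}\Lambda(\mathbf{r}, t)$$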


Note that the additive term $q \frac{d}{dt}\Lambda(\mathbf{r}, t)$ is an exact time differential. Thus the Lagrangian $L'$ is gauge invariant, 
implying that identical equations of motion are obtained using either of these equivalent Lagrangians. 
The force fields $\mathbf{E}$ and $\mathbf{B}$ can be used to show that the above transformation is gauge-invariant. That is, 

$$\mathbf{E}' = -\nabla \Phi' - \frac{\partial \mathbf{A}'}{\partial t} = -\nabla \Phi - \frac{\partial \mathbf{A}}{\partial t} = \mathbf{E}$$

$$\mathbf{B}' = \nabla \times \mathbf{A}' = \nabla \times \mathbf{A} = \mathbf{B}$$

That is, the additive terms due to the scalar field $\Lambda(\mathbf{r}, t)$ cancel. Thus the electromagnetic force fields following 
a gauge-invariant transformation are shown to be identical, in agreement with what is inferred directly by 
inspection of the Lagrangian. 
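
The relation between the transformed and original Lorentz-force Lagrangians can also be checked numerically. The following is a minimal sketch, not part of the original example: the potentials, the gauge function, the trajectory, and all function names are hypothetical choices made only for illustration. It compares the difference between the two Lagrangians with a finite-difference estimate of the total time derivative of the gauge function along the path.

```python
import math

# All fields below are hypothetical choices used only for illustration.
def phi(r, t):
    x, y, z = r
    return x * x - z * t

def vec_a(r, t):
    x, y, z = r
    return (-0.5 * y, 0.5 * x, t)

def gauge(r, t):
    """Arbitrary smooth gauge function Lambda(r, t)."""
    x, y, z = r
    return x * y * t + math.sin(z) * t * t

def grad_gauge(r, t):
    x, y, z = r
    return (y * t, x * t, math.cos(z) * t * t)

def gauge_partial_t(r, t):
    x, y, z = r
    return x * y + 2.0 * math.sin(z) * t

def path(t):
    """Assumed particle trajectory r(t)."""
    return (math.cos(t), math.sin(t), 0.5 * t)

def velocity(t):
    return (-math.sin(t), math.cos(t), 0.5)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def lagrangian(v, phi_val, a_val, m=1.0, q=1.0):
    # L = (1/2) m v.v - q (Phi - A.v)
    return 0.5 * m * dot(v, v) - q * (phi_val - dot(a_val, v))

def lagrangian_difference(t, m=1.0, q=1.0):
    """L' - L evaluated on the trajectory at time t."""
    r, v = path(t), velocity(t)
    a_new = tuple(a + g for a, g in zip(vec_a(r, t), grad_gauge(r, t)))
    phi_new = phi(r, t) - gauge_partial_t(r, t)
    return (lagrangian(v, phi_new, a_new, m, q)
            - lagrangian(v, phi(r, t), vec_a(r, t), m, q))

def total_time_derivative(t, q=1.0, h=1e-5):
    """q * d/dt Lambda(r(t), t) by a central finite difference."""
    return q * (gauge(path(t + h), t + h) - gauge(path(t - h), t - h)) / (2.0 * h)
```

Calling `lagrangian_difference(t)` and `total_time_derivative(t)` at the same time `t` should give values that agree to within the finite-difference error, which is the numerical counterpart of the statement that the two Lagrangians differ only by an exact time differential.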


9.3.3 Non-standard Lagrangians 


The definition of the standard Lagrangian was based on d'Alembert's differential variational principle. The 
flexibility and power of Lagrangian mechanics can be extended to a broader range of dynamical systems 
by employing an extended definition of the Lagrangian that is based on Hamilton's Principle, equation 9.2. 
Note that Hamilton's Principle was introduced 46 years after development of the standard formulation of 
Lagrangian mechanics. Hamilton's Principle provides a general definition of the Lagrangian that applies 
to standard Lagrangians, which are expressed as the difference between the kinetic and potential energies, 
as well as to non-standard Lagrangians where there may be no clear separation into kinetic and potential 
energy terms. These non-standard Lagrangians can be used with the Euler-Lagrange equations to generate 
the correct equations of motion, even though they may have no relation to the kinetic and potential energies. 
The extended definition of the Lagrangian based on Hamilton's action functional 9.1 can be exploited for 
developing non-standard definitions of the Lagrangian that may be applied to dynamical systems where use 
of the standard definition is inapplicable. First, non-standard Lagrangians can be just as useful as the standard 
Lagrangian for deriving equations of motion for a system. Secondly, non-standard Lagrangians that have no 
energy interpretation are available for deriving the equations of motion for many nonconservative systems. 
Thirdly, Lagrangians are useful irrespective of how they were derived. For example, they can be used to 
derive conservation laws or the equations of motion. Coordinate transformations of the Lagrangian are much 
simpler than those required for transforming the equations of motion. The relativistic Lagrangian defined in 
chapter 17.6 is a well-known example of a non-standard Lagrangian. 


9.3.4 Inverse variational calculus 


Non-standard Lagrangians and Hamiltonians are not based on the concept of kinetic and potential energies. 
Therefore, development of non-standard Lagrangians and Hamiltonians requires an alternative approach that 
ensures that they satisfy Hamilton’s Principle, equation 9.2, which underlies the Lagrangian and Hamil- 
tonian formulations. One useful alternative approach is to derive the Lagrangian or Hamiltonian via an 
inverse variational process based on the assumption that the equations of motion are known. Helmholtz de- 
veloped the field of inverse variational calculus which plays an important role in development of non-standard 
Lagrangians. An example of this approach is use of the well-known Lorentz force as the basis for deriving 
a corresponding Lagrangian to handle systems involving electromagnetic forces. Inverse variational calculus 
is a branch of mathematics that is beyond the scope of this textbook. The Douglas theorem[Dou41] states 
that, if the three Helmholtz conditions are satisfied, then there exists a Lagrangian that, when used with the 
Euler-Lagrange differential equations, leads to the given set of equations of motion. Thus, it will be assumed 
that the inverse variational calculus technique can be used to derive a Lagrangian from known equations of 
motion. 
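
A short worked illustration of the inverse approach may help. It uses the well-known Bateman (Caldirola-Kanai) form for the linearly damped oscillator, which is introduced here only as an example and is not derived in this section. Assume the known equation of motion is $\ddot{q} + \Gamma\dot{q} + \omega^2 q = 0$, with $\Gamma$ and $\omega$ constants. A candidate non-standard Lagrangian and its check against the Euler-Lagrange equation are:

```latex
% Candidate non-standard Lagrangian (assumed example, not from the text)
\begin{align}
L &= e^{\Gamma t}\left(\tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} m \omega^2 q^2\right) \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}
  &= \frac{d}{dt}\left(e^{\Gamma t} m \dot{q}\right)
   = e^{\Gamma t}\, m \left(\ddot{q} + \Gamma \dot{q}\right),
\qquad
\frac{\partial L}{\partial q} = -e^{\Gamma t}\, m \omega^2 q \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q}
  &= e^{\Gamma t}\, m \left(\ddot{q} + \Gamma \dot{q} + \omega^2 q\right) = 0
\end{align}
```

Since $e^{\Gamma t} m \neq 0$, the required equation of motion is recovered, even though this Lagrangian is not the difference between the kinetic and potential energies.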
9.4 Application of Hamilton’s Action Principle to mechanics 


Knowledge of the equations of motion is required to predict the response of a system to any set of initial 
conditions. Hamilton’s action principle, that is built into Lagrangian and Hamiltonian mechanics, coupled 
with the availability of a wide arsenal of variational principles and techniques, provides a remarkably powerful 
and broad approach to deriving the equations of motion required to determine the system response. 

As mentioned in the Prologue, derivation of the equations of motion for any system, based on Hamilton’s 
Action Principle, separates naturally into a hierarchical set of three stages that differ in both sophistication 
and understanding, as described below. 


1. Action stage: The primary “action stage” employs Hamilton’s Action functional, $S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt$, 
to derive the Lagrangian and Hamiltonian functionals. This action stage provides the most fundamental 
and sophisticated level of understanding. It involves specifying all the active degrees of freedom, as 
well as the interactions involved. Symmetries incorporated at this primary action stage can simplify 
subsequent use of the Hamiltonian and Lagrangian functionals. 


2. Hamiltonian/Lagrangian stage: The “Hamiltonian/Lagrangian stage” uses the Lagrangian or 
Hamiltonian functionals, that were derived at the action stage, in order to derive the equations of 
motion for the system of interest. Symmetries, not already incorporated at the primary action stage, 
may be included at this secondary stage. 


3. Equations of motion stage: The “equations-of-motion stage” uses the derived equations of motion to 
solve for the motion of the system subject to a given set of initial boundary conditions. Nonconservative 
forces, such as dissipative forces, that were not included at the primary and secondary stages, may be 
added at the equations of motion stage. 
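
The stationarity that underlies the action stage can be illustrated numerically. The sketch below is a minimal example under stated assumptions: a unit-mass, unit-stiffness harmonic oscillator, a midpoint-rule discretization of the action, and a sine-shaped variation that vanishes at both end points. None of these choices, nor the function names, come from the text. It estimates the first-order change of the action about a chosen path.

```python
import math

T_I, T_F = 0.0, 1.0  # assumed fixed end times

def action(path, n=2000, m=1.0, k=1.0):
    """Midpoint-rule estimate of S = integral of (m*qdot^2/2 - k*q^2/2) dt."""
    dt = (T_F - T_I) / n
    total = 0.0
    for i in range(n):
        t0 = T_I + i * dt
        q0, q1 = path(t0), path(t0 + dt)
        q_mid = 0.5 * (q0 + q1)
        q_dot = (q1 - q0) / dt
        total += (0.5 * m * q_dot ** 2 - 0.5 * k * q_mid ** 2) * dt
    return total

def true_path(t):
    """Solution of q'' = -q (m = k = 1) joining q(0) = 0 and q(1) = sin(1)."""
    return math.sin(t)

def straight_path(t):
    """A non-physical path with the same end points."""
    return math.sin(1.0) * t

def varied(base_path, eps):
    """Add a variation that vanishes at both end points."""
    return lambda t: base_path(t) + eps * math.sin(math.pi * (t - T_I) / (T_F - T_I))

def first_variation(base_path, eps=1e-3):
    """Central-difference estimate of dS/d(eps) at eps = 0."""
    return (action(varied(base_path, eps)) - action(varied(base_path, -eps))) / (2.0 * eps)
```

Under these assumptions, `first_variation(true_path)` is zero to within discretization error, whereas `first_variation(straight_path)` is clearly nonzero, which is the content of Hamilton's Principle for this simple system.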


Lagrange omitted the action stage when he used d’Alembert’s Principle to derive Lagrangian mechanics. 
The Newtonian mechanics approach omits both the primary “action” stage, as well as the secondary “Hamil- 
tonian/Lagrangian” stage, since Newton’s Laws of Motion directly specify the “equations-of-motion stage”. 
Thus these did not allow exploiting the considerable advantages provided by use of action, the Lagrangian, 
and the Hamiltonian. Newtonian mechanics requires that all the active forces be included when deriving the 
equations of motion, which involves dealing with vector quantities. In Newtonian mechanics, symmetries 
must be incorporated directly at the equations of motion stage, which is more difficult than when done at 
the primary “action” stage, or the secondary “Lagrangian/Hamiltonian” stage. The “action” and “Hamil- 
tonian/Lagrangian” stages allow for use of the powerful arsenal of mathematical techniques that have been 
developed for applying variational principles. 

There are considerable advantages to deriving the equations of motion based on Hamilton’s Principle, 
rather than deriving them using Newtonian mechanics. It is significantly easier to use variational principles to 
handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting at the equations- 
of-motion stage. For example, utilizing all three stages of algebraic mechanics facilitates accommodating 
extra degrees of freedom, symmetries, and interactions. The symmetries identified by Noether’s theorem are 
more easily recognized during the primary “action” and secondary “Hamiltonian/Lagrangian” stages rather 
than at the subsequent “equations of motion” stage. Approximations made at the “action” stage are easier 
to implement than at the “equations-of-motion” stage. Constrained motion is much more easily handled at 
the primary “action”, or secondary “Hamilton/Lagrangian” stages, than at the equations-of-motion stage. 
An important advantage of using Hamilton’s Action Principle, is that there is a close relationship between 
action in classical and quantal mechanics, as discussed in chapters 15 and 18. Algebraic principles, which 
underlie analytical mechanics, naturally encompass applications to many branches of modern physics, such 
as relativistic mechanics, fluid motion, and field theory. 

In summary, the use of the single fundamental invariant quantity, action, as described above, provides a 
powerful and elegant framework, that was developed first for classical mechanics, but now is exploited in a 
wide range of science, engineering, and economics. An important feature of using the algebraic approach to 
classical mechanics is the tremendous arsenal of powerful mathematical techniques that have been developed 
for use of variational calculus applied to Lagrangian and Hamiltonian mechanics. Some of these variational 
techniques were presented in chapters 6, 7, 8, and 9, while others will be introduced in chapter 15. 
9.5 Summary 


Hamilton’s 1834 publication, introducing both Hamilton’s Principle of Stationary Action and Hamiltonian 
mechanics, marked the crowning achievement for the development of variational principles in classical 
mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses the conjugate coordinates 
q, p, plus time t, which is a considerable advantage in most branches of physics and engineering. Compared 
to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader arsenal of powerful techniques 
that can be exploited to obtain an analytical solution of the integrals of the motion for complicated sys- 
tems, as described in chapter 15. In addition, Hamiltonian dynamics provides a means of determining the 
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamen- 
tal underlying physics in applications to fields such as quantum or statistical physics. As a consequence, 
Hamiltonian mechanics has become the preeminent variational approach used in modern physics. 

This chapter has introduced and discussed Hamilton’s Principle of Stationary Action, which underlies 
the elegant and remarkably powerful Lagrangian and Hamiltonian representations of algebraic mechanics. 
The basic concepts employed in algebraic mechanics are summarized below. 


Hamilton’s Action Principle: As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton’s 
action functional 

$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{9.1}$$
Hamilton’s Principle of least action states that 

$$\delta S = \delta \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt = 0 \tag{9.2}$$

Generalized momentum p: In chapter 7.2, the generalized (canonical) momentum was defined in terms 
of the Lagrangian L to be 

$$p_i \equiv \frac{\partial L}{\partial \dot{q}_i} \tag{7.3}$$

Chapter 9.2.2 defined the generalized momentum in terms of the action functional S to be 

$$p_i = \frac{\partial S(\mathbf{q}, \mathbf{p}, t)}{\partial q_i}$$

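Generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$: Jacobi’s Generalized Energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was defined in equation 7.37 as 

$$h(\mathbf{q}, \dot{\mathbf{q}}, t) \equiv \sum_j \left(\dot{q}_j \frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.37}$$
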
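Hamiltonian function $H(\mathbf{q}, \mathbf{p}, t)$: The Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ was defined in terms of the generalized 
energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ plus the generalized momentum. That is 

$$H(\mathbf{q}, \mathbf{p}, t) \equiv h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.37}$$
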
where $\mathbf{p}$, $\mathbf{q}$ correspond to n-dimensional vectors, e.g. $\mathbf{q} = (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$. 
Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian 
functions. Note that whereas the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is expressed in terms of the coordinates $\mathbf{q}$, plus 
conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their 
conjugate momenta $\mathbf{p}$. For scleronomic systems, using the standard Lagrangian in equations 7.44 and 7.29 
shows that the Hamiltonian simplifies to be equal to the total mechanical energy, that is, $H = T + U$. 
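
As a brief worked check of these definitions, consider a one-dimensional harmonic oscillator with the standard Lagrangian. This example is assumed here for illustration and is not part of the summary:

```latex
% Assumed example: one-dimensional harmonic oscillator
\begin{align}
L &= \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2 ,
\qquad
p = \frac{\partial L}{\partial \dot{q}} = m \dot{q}
\;\Rightarrow\; \dot{q} = \frac{p}{m} \\
H(q, p) &= p \dot{q} - L
 = \frac{p^2}{m} - \left(\frac{p^2}{2m} - \tfrac{1}{2} k q^2\right)
 = \frac{p^2}{2m} + \tfrac{1}{2} k q^2 = T + U
\end{align}
```

The velocity has been eliminated in favor of the conjugate momentum, and for this scleronomic system the Hamiltonian equals the total mechanical energy, as stated above.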
Generalized energy theorem: The equations of motion lead to the generalized energy theorem which 
states that the time dependence of the Hamiltonian is related to the time dependence of the Lagrangian. 

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \dot{q}_j \left[Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\right] - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{7.38}$$

Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the 
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion. 


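Lagrange equations of motion: Equation 6.60 gives that the N Lagrange equations of motion are 

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.60}$$

where $j = 1, 2, 3, \ldots, N$. 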


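Hamilton’s equations of motion: Chapter 8.3 showed that a Legendre transform, plus the Lagrange-Euler 
equations (9.64, 9.65), lead to Hamilton’s equations of motion. Hamilton derived these equations of 
motion directly from the action functional, as shown in chapter 9.2. 

$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \left[\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right]$$
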
Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_k$, $q_k$ are treated 
as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not 
recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion 
from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. 
Hamilton’s equations give 2s first-order differential equations for $p_k$, $q_k$ for the s degrees of freedom. 
Lagrange’s equations give s second-order differential equations for the variables $q_k$, $\dot{q}_k$. 
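
The first-order structure of Hamilton's equations lends itself to direct numerical integration. The sketch below is a minimal illustration under stated assumptions: a one-dimensional harmonic oscillator Hamiltonian with no constraint or nonconservative forces, and a leapfrog (kick-drift-kick) stepping scheme. The integrator choice and all function names are illustrative and are not taken from the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """Assumed example: H = p^2/(2m) + k q^2/2."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def dH_dq(q, k=1.0):
    return k * q

def dH_dp(p, m=1.0):
    return p / m

def leapfrog_step(q, p, dt, m=1.0, k=1.0):
    """One kick-drift-kick step of qdot = dH/dp, pdot = -dH/dq."""
    p_half = p - 0.5 * dt * dH_dq(q, k)
    q_new = q + dt * dH_dp(p_half, m)
    p_new = p_half - 0.5 * dt * dH_dq(q_new, k)
    return q_new, p_new

def integrate(q0, p0, t_end, dt=1e-3, m=1.0, k=1.0):
    """Advance (q, p) from t = 0 to t_end with a fixed step dt."""
    q, p = q0, p0
    for _ in range(int(round(t_end / dt))):
        q, p = leapfrog_step(q, p, dt, m, k)
    return q, p
```

For example, `integrate(1.0, 0.0, 1.0)` should return values close to `(math.cos(1.0), -math.sin(1.0))` when m = k = 1, and `hamiltonian(q, p)` evaluated on the result should stay close to its initial value of 0.5, reflecting that this Hamiltonian is a constant of motion.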


Hamilton-Jacobi equation: Hamilton used Hamilton’s Principle plus equation 9.19 to derive the 
Hamilton-Jacobi equation. 

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{9.19}$$

The solution of Hamilton's equations is trivial if the Hamiltonian is a constant of motion, or when a set of 
generalized coordinates can be identified for which all the coordinates $q_j$ are constant, or are cyclic (also called 
ignorable coordinates). Jacobi developed the mathematical framework of canonical transformation required 
to exploit the Hamilton-Jacobi equation. 
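
A minimal worked check of the Hamilton-Jacobi equation, using a free particle as an assumed example (the constants $\alpha$ and $\beta$ are introduced here only for illustration):

```latex
% Assumed example: free particle, H = p^2 / 2m
\begin{align}
S(q, \alpha, t) &= \alpha q - \frac{\alpha^2}{2m} t ,
\qquad
p = \frac{\partial S}{\partial q} = \alpha \\
\frac{\partial S}{\partial t} + H\!\left(q, \frac{\partial S}{\partial q}\right)
 &= -\frac{\alpha^2}{2m} + \frac{\alpha^2}{2m} = 0 \\
\beta \equiv \frac{\partial S}{\partial \alpha} &= q - \frac{\alpha}{m} t
\;\Rightarrow\; q(t) = \beta + \frac{\alpha}{m} t
\end{align}
```

The constant momentum $\alpha$ and the constant $\beta$ together give uniform motion, illustrating how a solution of the Hamilton-Jacobi equation reduces the dynamics to constants of motion.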


Hamilton’s Principle applied using initial boundary conditions: The definition of Hamilton's Principle 
assumes integration between the initial time $t_i$ and final time $t_f$. A recent development has extended 
applications of Hamilton's Principle to apply to systems that are defined in terms of only the initial 
boundary conditions. This method doubles the number of degrees of freedom and uses a coupling Lagrangian 
$K(q_2, \dot{q}_2, q_1, \dot{q}_1, t)$ between the corresponding $q_1$ and $q_2$ doubled degrees of freedom 

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}^I} - \frac{\partial L}{\partial q^I} = \left[\frac{\partial K}{\partial q_-^I} - \frac{d}{dt}\frac{\partial K}{\partial \dot{q}_-^I}\right]_{PL} \equiv Q^I(q^I, \dot{q}^I, t) \tag{9.50}$$

and where $Q^I$ is a generalized nonconservative force derived from $K$. 
Standard Lagrangians: Derivation of Lagrangian mechanics, using d’Alembert’s principle of virtual 
work, assumed that the Lagrangian is defined by equation 9.52 

$$L = T - U$$

This was used in equation 9.3 to derive the action in terms of the fundamental Lagrangian defined by equation 
9.52. The assumption that the action S is the fundamental property inverts this procedure and now equation 
9.3 is used to derive the Lagrangian. That is, the assumption that Hamilton’s Principle is the foundation 
of algebraic mechanics defines the Lagrangian in terms of the fundamental action S. 


Non-standard Lagrangians: The flexibility and power of Lagrangian mechanics can be extended to a 
broader range of dynamical systems by employing an extended definition of the Lagrangian that assumes that 
the action is the fundamental property, and then the Lagrangian is defined in terms of Hamilton’s variational 
action principle using equation 9.2. It was illustrated that the inverse variational calculus formalism can 
be used to identify non-standard Lagrangians that generate the required equations of motion. These non- 
standard Lagrangians can be very different from the standard Lagrangian and do not separate into kinetic 
and potential energy components. These alternative Lagrangians can be used to handle dissipative systems 
which are beyond the range of validity when using standard Lagrangians. That is, it was shown that several 
very different Lagrangians and Hamiltonians can be equivalent for generating useful equations of motion 
of a system. Currently the use of non-standard Lagrangians is a narrow, but active, frontier of classical 
mechanics with important applications to relativistic mechanics. 


Gauge invariance of the standard Lagrangian: It was shown that there is a continuum of equivalent 
standard Lagrangians that lead to the same set of equations of motion for a system. This feature is related 
to gauge invariance in mechanics. The following transformations change the standard Lagrangian, but leave 
the equations of motion unchanged. 


1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels 
out when the derivatives in the Euler-Lagrange differential equations are applied. 


2. Similarly the Lagrangian is indefinite with respect to addition of a constant kinetic energy. 


3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form 
$L \rightarrow L + \frac{d}{dt}\left[\Lambda(\mathbf{q}, t)\right]$ for any differentiable function $\Lambda(\mathbf{q}, t)$ of the generalized coordinates, plus time, that has 
continuous second derivatives. 


Application of Hamilton’s Action Principle to mechanics: The derivation of the equations of mo- 
tion for any system can be separated into a hierarchical set of three stages in both sophistication and 
understanding. Variational principles are employed during the primary “action” stage and secondary “Hamil- 
ton/Lagrangian” stage to derive the required equations of motion, which then are solved during the third 
“equations-of-motion stage”. Hamilton’s action is a scalar functional that is the basis for deriving 
the Lagrangian and Hamiltonian functions. The primary “action stage” uses Hamilton’s Action functional, 
$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt$, to derive the Lagrangian and Hamiltonian functionals; this stage 
provides the most fundamental and sophisticated level of understanding. The second 
“Hamiltonian/Lagrangian stage” involves using the Lagrangian and Hamiltonian functionals to derive the 
equations of motion. The third “equations-of-motion stage” uses the derived equations of motion to solve 
for the motion subject to a given set of initial boundary conditions. The Newtonian mechanics approach 
bypasses the primary “action” stage, as well as the secondary “Hamiltonian/Lagrangian” stage. That is, 
Newtonian mechanics starts at the third “equations-of-motion” stage, which does not allow exploiting the 
considerable advantages provided by use of action, the Lagrangian, and the Hamiltonian. Newtonian me- 
chanics requires that all the active forces be included when deriving the equations of motion, which involves 
dealing with vector quantities. This is in contrast to the action, Lagrangian, and Hamiltonian which are 
scalar functionals. Both the primary “action” stage, and the secondary “Lagrangian/Hamiltonian” stage, 
exploit the powerful arsenal of mathematical techniques that have been developed for exploiting variational 
principles. 


Chapter 10 


Nonconservative systems 


10.1 Introduction 


Hamilton’s action principle, Lagrangian mechanics, and Hamiltonian mechanics, all exploit the concept of 
action which is a single, invariant, quantity. These algebraic formulations of mechanics all are based on 
energy, which is a scalar quantity, and thus these formulations are easier to handle than the vector concept 
of force employed in Newtonian mechanics. Algebraic formulations provide a powerful and elegant approach 
to understand and develop the equations of motion of systems in nature. Chapters 6 — 9 applied variational 
principles to Hamilton’s action principle which led to the Lagrangian, and Hamiltonian formulations that 
simplify determination of the equations of motion for systems in classical mechanics. 

A conservative force has the property that the total work done moving between two points is independent 
of the path taken. That is, a conservative force is time symmetric and can be expressed in terms of the 
gradient of a scalar potential V. Hamilton’s action principle implicitly assumes that the system is conservative 
for those degrees of freedom that are built into the definition of the action, and the related Lagrangian, and 
Hamiltonian. The focus of this chapter is to discuss the origins of nonconservative motion and how it can 
be handled in algebraic mechanics. 


10.2 Origins of nonconservative motion 


Nonconservative degrees of freedom involve irreversible processes, such as dissipation and damping, and also 
can result from coarse-graining, or ignoring coupling to active degrees of freedom. The nonconservative role 
of ignored active degrees of freedom is illustrated by the weakly-coupled double harmonic oscillator system 
discussed below. Let the two harmonic oscillators have masses $(m_1, m_2)$, uncoupled angular frequencies 
$(\omega_1, \omega_2)$, and oscillation amplitudes $(q_1, q_2)$. Assume that the coupling potential energy is $U = \kappa q_1 q_2$. The 
Lagrangian for this weakly-coupled double oscillator is 

$$L(q_1, q_2, \dot{q}_1, \dot{q}_2, t) = \frac{m_1}{2}\left(\dot{q}_1^2 - \omega_1^2 q_1^2\right) - \kappa q_1 q_2 + \frac{m_2}{2}\left(\dot{q}_2^2 - \omega_2^2 q_2^2\right) \tag{10.1}$$


Note that the total Lagrangian is conservative since the Lagrangian is explicitly time independent. As shown 
in chapter 14.2, the solution for the amplitudes of the oscillation for the coupled system are given by 


$$q_1(t) = D \sin\left[\left(\frac{\omega_1 - \omega_2}{2}\right) t\right] \sin\left[\left(\frac{\omega_1 + \omega_2}{2}\right) t\right] \tag{10.2}$$

$$q_2(t) = D \cos\left[\left(\frac{\omega_1 - \omega_2}{2}\right) t\right] \cos\left[\left(\frac{\omega_1 + \omega_2}{2}\right) t\right] \tag{10.3}$$

The system exhibits the common “beats” behavior where the coupled harmonic oscillators have an angular 
frequency that is the average oscillator frequency $\omega_{average} = \left(\frac{\omega_1 + \omega_2}{2}\right)$, and the oscillation intensities are 
modulated at the difference frequency, $\omega_{difference} = \left(\frac{\omega_1 - \omega_2}{2}\right)$. Although the total energy is conserved 
for this conservative system, this shared energy flows back and forth between the two coupled harmonic 
oscillators at the difference frequency. If the equations of motion for oscillator 1 ignore the coupling to the 
motion of oscillator 2, that is, assume a constant average value $q_2 = \langle q_2 \rangle$ is used, then the intensity $|q_1|^2$ and 
energy of the first oscillator still is modulated by the $\left[\sin\left(\frac{\omega_1 - \omega_2}{2}\right) t\right]^2$ term. Thus the total energy for this 
truncated coupled-oscillator system is no longer conserved due to neglect of the energy flowing into and out 
of oscillator 1 due to its coupling to oscillator 2. That is, the solution for the truncated system of oscillator 
1 is not conservative since it is exchanging energy with the coupled, but ignored, second oscillator. This 
elementary example illustrates that ignoring active degrees of freedom can transform a conservative system 
into a nonconservative system, for which the equations of motion derived using the truncated Lagrangian are 
incorrect. 
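
The energy exchange described above can be reproduced with a short simulation. The sketch below is a minimal illustration under stated assumptions: two identical unit-mass oscillators, a bilinear coupling potential with a small coupling strength (called `kappa` here), and a velocity-Verlet integrator. The parameter values and function names are hypothetical and not taken from the text. It records the energy of oscillator 1 alone and the total energy of the coupled system.

```python
def accelerations(q1, q2, m1, m2, w1, w2, kappa):
    """Accelerations from U = m1*w1^2*q1^2/2 + m2*w2^2*q2^2/2 + kappa*q1*q2."""
    a1 = -w1 * w1 * q1 - kappa * q2 / m1
    a2 = -w2 * w2 * q2 - kappa * q1 / m2
    return a1, a2

def energies(q1, q2, v1, v2, m1, m2, w1, w2, kappa):
    """Return (energy of oscillator 1 alone, total energy including coupling)."""
    e1 = 0.5 * m1 * (v1 * v1 + w1 * w1 * q1 * q1)
    e2 = 0.5 * m2 * (v2 * v2 + w2 * w2 * q2 * q2)
    return e1, e1 + e2 + kappa * q1 * q2

def simulate(t_end=40.0, dt=1e-3, m1=1.0, m2=1.0, w1=1.0, w2=1.0, kappa=0.1):
    """Velocity-Verlet integration starting with all energy in oscillator 1."""
    q1, q2, v1, v2 = 1.0, 0.0, 0.0, 0.0
    a1, a2 = accelerations(q1, q2, m1, m2, w1, w2, kappa)
    e1_history, total_history = [], []
    for _ in range(int(round(t_end / dt))):
        q1 += v1 * dt + 0.5 * a1 * dt * dt
        q2 += v2 * dt + 0.5 * a2 * dt * dt
        a1_new, a2_new = accelerations(q1, q2, m1, m2, w1, w2, kappa)
        v1 += 0.5 * (a1 + a1_new) * dt
        v2 += 0.5 * (a2 + a2_new) * dt
        a1, a2 = a1_new, a2_new
        e1, total = energies(q1, q2, v1, v2, m1, m2, w1, w2, kappa)
        e1_history.append(e1)
        total_history.append(total)
    return e1_history, total_history
```

With these assumed parameters the total energy stays essentially constant over the run, while the energy attributed to oscillator 1 alone varies over a large fraction of its initial value. This is the sense in which the truncated description of oscillator 1 appears nonconservative.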

The above example illustrates the importance of including all active degrees of freedom when deriving the 
equations of motion, in order to ensure that the total system is conservative. Unfortunately, nonconservative 
systems due to viscous or frictional dissipation typically result from weak thermal interactions with an 
enormous number of nearby atoms, which makes inclusion of all of these degrees of freedom impractical. 
Even though the detailed behavior of such dissipative degrees of freedom may not be of direct interest, all 
the active degrees of freedom must be included when applying Lagrangian or Hamiltonian mechanics. 


10.3 Algebraic mechanics for nonconservative systems 


Since Lagrangian and Hamiltonian formulations are invalid for the nonconservative degrees of freedom, the 
following three approaches are used to include nonconservative degrees of freedom directly in the Lagrangian 
and Hamiltonian formulations of mechanics. 


1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system, 
so that the expanded system is conservative. This is the preferred approach when it is viable. Hamil- 
ton’s action principle based on initial conditions, introduced in chapter 9.2.4, doubles the number of 
degrees of freedom, which can be used to account for the dissipative forces providing one approach to 
solve nonconservative systems. However, this approach typically is impractical for handling dissipative 
processes because of the large number of degrees of freedom that are involved in thermal dissipation. 


2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces
$Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh’s
dissipation function provides an elegant and powerful way to express the generalized forces in terms of
a scalar potential.


3. New degrees of freedom or effective forces can be postulated that are then incorporated into the 
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces. 


Examples that exploit the above three ways to introduce nonconservative dissipative forces in algebraic 
formulations are given below. 


10.4 Rayleigh’s dissipation function 


As mentioned above, nonconservative systems involving viscous or frictional dissipation typically result from
weak thermal interactions with many nearby atoms, making it impractical to include a complete set of active 
degrees of freedom. In addition, dissipative systems usually involve complicated dependences on the velocity 
and surface properties that are best handled by including the dissipative drag force explicitly as a generalized 
drag force in the Euler-Lagrange equations. The drag force can have any functional dependence on velocity, 
position, or time. 

$$\mathbf{F}^{drag} = -f(\dot{\mathbf{q}}, \mathbf{q}, t)\,\hat{\mathbf{v}} \tag{10.4}$$


Note that since the drag force is dissipative the dominant component of the drag force must point in the 
opposite direction to the velocity vector. 

In 1881 Lord Rayleigh[Ray1881, Ray1887] showed that if a dissipative force $\mathbf{F}$ depends linearly on velocity,
it can be expressed in terms of a scalar potential function of the generalized velocities called the Rayleigh
dissipation function $R(\dot{\mathbf{q}})$. The Rayleigh dissipation function is an elegant way to include linear
velocity-dependent dissipative forces in both Lagrangian and Hamiltonian mechanics, as is illustrated below.


10.4.1 Generalized dissipative forces for linear velocity dependence


Consider $n$ equations of motion for the $n$ degrees of freedom, and assume that the dissipation depends linearly
on velocity. Then, allowing all possible cross coupling of the equations of motion for $q_j$, the equations of
motion can be written in the form

$$\sum_{i=1}^{n}\left[m_{ij}\ddot{q}_i + b_{ij}\dot{q}_i + c_{ij}q_i\right] = Q_j(t) \tag{10.5}$$

Multiplying equation 10.5 by $\dot{q}_j$, taking the time integral, and summing over $i, j$ gives the following energy equation


$$\sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t m_{ij}\dot{q}_j\ddot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t b_{ij}\dot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t c_{ij}\dot{q}_j q_i\,dt = \sum_{j=1}^{n}\int_0^t Q_j(t)\dot{q}_j\,dt \tag{10.6}$$

The right-hand term is the total energy supplied to the system by the external generalized forces $Q_j(t)$
up to the time $t$. The first time-integral term on the left-hand side is the total kinetic energy, while the third
time-integral term equals the potential energy. The integrand of the second integral term on the left is defined to
equal $2R(\dot{\mathbf{q}})$, where Rayleigh’s dissipation function $R(\dot{\mathbf{q}})$ is defined as

$$R(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$

and the summations are over all $n$ particles of the system. This definition allows for complicated
cross-coupling effects between the $n$ particles.

The particle-particle coupling effects usually can be neglected allowing use of the simpler definition that 
includes only the diagonal terms. Then the diagonal form of the Rayleigh dissipation function simplifies to 


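$$R(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n} b_i\dot{q}_i^2 \tag{10.8}$$


Therefore the frictional force in the $q_i$ direction depends linearly on velocity $\dot{q}_i$, that is

$$F^f_{q_i} = -\frac{\partial R(\dot{\mathbf{q}})}{\partial \dot{q}_i} = -b_i\dot{q}_i \tag{10.9}$$

In general, the dissipative force is the velocity gradient of the Rayleigh dissipation function,

$$\mathbf{F}^f = -\nabla_{\dot{\mathbf{q}}} R(\dot{\mathbf{q}}) \tag{10.10}$$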


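The physical significance of the Rayleigh dissipation function is illustrated by calculating the work done
by one particle $i$ against friction, which is

$$dW_i^f = -\mathbf{F}_i^f\cdot d\mathbf{r}_i = -\mathbf{F}_i^f\cdot\dot{\mathbf{q}}_i\,dt = b_i\dot{q}_i^2\,dt \tag{10.11}$$

Therefore

$$2R(\dot{\mathbf{q}}) = \frac{dW^f}{dt} \tag{10.12}$$

which is the rate of energy (power) loss due to the dissipative forces involved. The same relation is obtained
after summing over all the particles involved.

Transforming the frictional force into generalized coordinates requires equation 6.27

$$\dot{\mathbf{r}}_i = \sum_{k}\frac{\partial \mathbf{r}_i}{\partial q_k}\dot{q}_k + \frac{\partial \mathbf{r}_i}{\partial t} \tag{10.13}$$

Note that the derivative with respect to $\dot{q}_k$ equals

$$\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_k} = \frac{\partial \mathbf{r}_i}{\partial q_k} \tag{10.14}$$

Using equations 6.28 and 6.29, the $j$ component of the generalized frictional force $Q_j^f$ is given by

$$Q_j^f = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\sum_{i=1}^{n}\nabla_{\dot{\mathbf{r}}_i}R\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\frac{\partial R(\dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.15}$$

Equation 10.15 provides an elegant expression for the generalized dissipative force $Q_j^f$ in terms of
Rayleigh’s scalar dissipation potential $R$.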
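The relations 10.7, 10.12 and 10.15 can be checked numerically. The following is a minimal sketch, not part of the original derivation; the symmetric damping matrix `b` and the velocities `qdot` are hypothetical values chosen only for illustration. It evaluates $R$, obtains the generalized dissipative forces as $-\partial R/\partial \dot{q}_j$ by central finite differences, and computes the power loss $-\sum_j Q_j^f \dot{q}_j$, which should equal $2R$.

```python
def rayleigh(b, qdot):
    """Rayleigh dissipation function R = 1/2 * sum_ij b[i][j]*qdot[i]*qdot[j] (equation 10.7)."""
    n = len(qdot)
    return 0.5 * sum(b[i][j] * qdot[i] * qdot[j] for i in range(n) for j in range(n))


def dissipative_force(b, qdot, h=1e-6):
    """Generalized dissipative forces Q_j^f = -dR/d(qdot_j) by central differences (equation 10.15)."""
    forces = []
    for j in range(len(qdot)):
        up = list(qdot)
        dn = list(qdot)
        up[j] += h
        dn[j] -= h
        forces.append(-(rayleigh(b, up) - rayleigh(b, dn)) / (2 * h))
    return forces


# Hypothetical symmetric damping matrix with cross coupling, and hypothetical velocities.
b = [[0.8, 0.1],
     [0.1, 0.5]]
qdot = [1.5, -2.0]

R_value = rayleigh(b, qdot)
Q = dissipative_force(b, qdot)
# Rate at which the system loses energy to the dissipative forces.
power_loss = -sum(Qj * vj for Qj, vj in zip(Q, qdot))
```

For a symmetric `b` the finite-difference forces reduce to $-\sum_i b_{ji}\dot{q}_i$, and `power_loss` agrees with `2 * R_value` to rounding error.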


10.4.2 Generalized dissipative forces for nonlinear velocity dependence 


The above discussion of the Rayleigh dissipation function was restricted to the special case of linear velocity- 
dependent dissipation. Virga[Vir15] proposed that the scope of the classical Rayleigh-Lagrange formalism 
can be extended to include nonlinear velocity dependent dissipation by assuming that the nonconservative
dissipative forces are defined by

$$F^f_{q_i} = -\frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_i} \tag{10.16}$$

where the generalized Rayleigh dissipation function $R(\mathbf{q}, \dot{\mathbf{q}})$ satisfies the general Lagrange mechanics relation

$$\frac{\delta L}{\delta q} - \frac{\partial R}{\partial \dot{q}} = 0 \tag{10.17}$$

This generalized Rayleigh dissipation function eliminates the prior restriction to linear dissipation processes,
which greatly expands the range of validity for using Rayleigh's dissipation function.


10.4.3 Lagrange equations of motion 


Linear dissipative forces can be directly, and elegantly, included in Lagrangian mechanics by using Rayleigh's
dissipation function as a generalized force $Q_j^f$. Inserting Rayleigh dissipation function 10.15 in the generalized
Lagrange equations of motion 6.60 gives

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.18}$$

where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear,
velocity-dependent, frictional force $Q_j^f$. The holonomic forces of constraint are absorbed into the Lagrange multiplier
term.


10.4.4 Hamiltonian mechanics


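If the nonconservative forces depend linearly on velocity, and are derivable from Rayleigh’s dissipation
function according to equation 10.15, then using the definition of generalized momentum gives

$$\dot{p}_j = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.19}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.20}$$

Thus Hamilton’s equations become

$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{10.21}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.22}$$

The Rayleigh dissipation function $R(\mathbf{q}, \dot{\mathbf{q}})$ provides an elegant and convenient way to account for
dissipative forces in both Lagrangian and Hamiltonian mechanics.


10.1 Example: Driven, linearly-damped, coupled linear oscillators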


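Consider the two identical, linearly damped, coupled oscillators (damping constant $\beta$) shown in the figure. A
periodic force $F = F_0\cos(\omega t)$ is applied to the left-hand mass $m$. The kinetic energy of the system is

$$T = \frac{1}{2}m\left(\dot{x}_1^2 + \dot{x}_2^2\right)$$

[Figure: Harmonically-driven, linearly-damped, coupled linear oscillators.]

The potential energy is

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'\left(x_2 - x_1\right)^2 = \frac{1}{2}\left(\kappa + \kappa'\right)x_1^2 + \frac{1}{2}\left(\kappa + \kappa'\right)x_2^2 - \kappa' x_1 x_2$$

Thus the Lagrangian equals

$$L = \frac{1}{2}m\left(\dot{x}_1^2 + \dot{x}_2^2\right) - \left[\frac{1}{2}\left(\kappa + \kappa'\right)x_1^2 + \frac{1}{2}\left(\kappa + \kappa'\right)x_2^2 - \kappa' x_1 x_2\right]$$

Since the damping is linear, it is possible to use the Rayleigh dissipation function

$$R = \frac{1}{2}\beta\left(\dot{x}_1^2 + \dot{x}_2^2\right)$$

The applied generalized forces are

$$Q'_1 = F_0\cos(\omega t) \qquad Q'_2 = 0$$

Use the Euler-Lagrange equations 10.18 to derive the equations of motion

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial R}{\partial \dot{q}_j} = Q'_j + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$$

gives

$$m\ddot{x}_1 + \beta\dot{x}_1 + \left(\kappa + \kappa'\right)x_1 - \kappa' x_2 = F_0\cos(\omega t)$$
$$m\ddot{x}_2 + \beta\dot{x}_2 + \left(\kappa + \kappa'\right)x_2 - \kappa' x_1 = 0$$

These two coupled equations can be decoupled and simplified by making a transformation to normal
coordinates, $\eta_1, \eta_2$ where

$$\eta_1 = x_1 - x_2 \qquad \eta_2 = x_1 + x_2$$

Thus

$$x_1 = \frac{1}{2}\left(\eta_1 + \eta_2\right) \qquad x_2 = \frac{1}{2}\left(\eta_2 - \eta_1\right)$$

Inserting these into the equations of motion gives

$$m\left(\ddot{\eta}_1 + \ddot{\eta}_2\right) + \beta\left(\dot{\eta}_1 + \dot{\eta}_2\right) + \left(\kappa + \kappa'\right)\left(\eta_1 + \eta_2\right) - \kappa'\left(\eta_2 - \eta_1\right) = 2F_0\cos(\omega t)$$
$$m\left(\ddot{\eta}_2 - \ddot{\eta}_1\right) + \beta\left(\dot{\eta}_2 - \dot{\eta}_1\right) + \left(\kappa + \kappa'\right)\left(\eta_2 - \eta_1\right) - \kappa'\left(\eta_1 + \eta_2\right) = 0$$

Adding and subtracting these two equations gives the following two decoupled equations

$$\ddot{\eta}_1 + \frac{\beta}{m}\dot{\eta}_1 + \frac{\left(\kappa + 2\kappa'\right)}{m}\eta_1 = \frac{F_0}{m}\cos(\omega t)$$
$$\ddot{\eta}_2 + \frac{\beta}{m}\dot{\eta}_2 + \frac{\kappa}{m}\eta_2 = \frac{F_0}{m}\cos(\omega t)$$

Define $\Gamma = \frac{\beta}{m}$, $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, $\omega_2 = \sqrt{\frac{\kappa}{m}}$, $A = \frac{F_0}{m}$. Then the two independent equations of motion become

$$\ddot{\eta}_1 + \Gamma\dot{\eta}_1 + \omega_1^2\eta_1 = A\cos(\omega t) \qquad \ddot{\eta}_2 + \Gamma\dot{\eta}_2 + \omega_2^2\eta_2 = A\cos(\omega t)$$

This solution is a superposition of two independent, linearly-damped, driven normal modes $\eta_1$ and $\eta_2$ that
have different natural frequencies $\omega_1$ and $\omega_2$. For weak damping these two driven normal modes each undergo
damped oscillatory motion with the $\eta_1$ and $\eta_2$ normal modes exhibiting resonances at
$\omega'_1 = \sqrt{\omega_1^2 - 2\left(\frac{\Gamma}{2}\right)^2}$ and $\omega'_2 = \sqrt{\omega_2^2 - 2\left(\frac{\Gamma}{2}\right)^2}$.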
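The decoupling into normal modes can be checked numerically. The sketch below is illustrative only: it assumes the coupled equations of motion have the form $m\ddot{x}_1 + \beta\dot{x}_1 + (\kappa+\kappa')x_1 - \kappa' x_2 = F_0\cos(\omega t)$ and $m\ddot{x}_2 + \beta\dot{x}_2 + (\kappa+\kappa')x_2 - \kappa' x_1 = 0$, and all parameter values and initial conditions are hypothetical. It integrates the coupled pair and the two independent normal-mode equations with a fourth-order Runge-Kutta step, then compares $x_1 - x_2$ with $\eta_1$ and $x_1 + x_2$ with $\eta_2$.

```python
import math

# Hypothetical parameters: mass, damping constant, spring constants, drive amplitude and frequency.
m, beta, kappa, kappa_c = 1.0, 0.2, 4.0, 1.5
F0, w = 1.0, 1.7


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def coupled(t, y):
    """Coupled equations of motion in the physical coordinates x1, x2."""
    x1, x2, v1, v2 = y
    a1 = (F0 * math.cos(w * t) - beta * v1 - (kappa + kappa_c) * x1 + kappa_c * x2) / m
    a2 = (-beta * v2 - (kappa + kappa_c) * x2 + kappa_c * x1) / m
    return [v1, v2, a1, a2]


def normal_modes(t, y):
    """Two independent driven, damped oscillators in the normal coordinates eta1, eta2."""
    e1, e2, u1, u2 = y
    gamma = beta / m
    w1_sq = (kappa + 2 * kappa_c) / m
    w2_sq = kappa / m
    amp = F0 / m
    return [u1, u2,
            amp * math.cos(w * t) - gamma * u1 - w1_sq * e1,
            amp * math.cos(w * t) - gamma * u2 - w2_sq * e2]


x = [0.3, -0.1, 0.0, 0.2]                       # x1, x2, v1, v2 at t = 0
eta = [x[0] - x[1], x[0] + x[1], x[2] - x[3], x[2] + x[3]]
t, h = 0.0, 0.001
for _ in range(5000):
    x = rk4_step(coupled, t, x, h)
    eta = rk4_step(normal_modes, t, eta, h)
    t += h

mismatch = max(abs(x[0] - x[1] - eta[0]), abs(x[0] + x[1] - eta[1]))
```

Because both systems are linear and related by a constant linear transformation, `mismatch` stays at the level of floating-point rounding.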


10.2 Example: Kirchhoff’s rules for electrical circuits 


The mathematical equations governing the behavior of mechanical systems and LRC electrical circuits 
have a close similarity. Thus variational methods can be used to derive the analogous behavior for electrical 
circuits. For example, for a system of $n$ separate circuits, the magnetic flux $\Phi_{ik}$ through circuit $i$, due to
electrical current $I_k = \dot{q}_k$ flowing in circuit $k$, is given by

$$\Phi_{ik} = M_{ik}\dot{q}_k$$

where $M_{ik}$ is the mutual inductance. The diagonal term $M_{ii} = L_i$ corresponds to the self inductance of
circuit $i$. The net magnetic flux $\Phi_i$ through circuit $i$, due to all $n$ circuits, is the sum

$$\Phi_i = \sum_{k=1}^{n} M_{ik}\dot{q}_k$$

Thus the total magnetic energy $W_{mag}$, which is analogous to kinetic energy $T$, is given by summing over all
$n$ circuits to be

$$W_{mag} = T = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} M_{ik}\dot{q}_i\dot{q}_k$$

Similarly the electrical energy $W_{elec}$ stored in the mutual capacitance $C_{ik}$ between the $n$ circuits, which
is analogous to potential energy, $U$, is given by

$$W_{elec} = U = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n}\frac{q_i q_k}{C_{ik}}$$

Thus the Lagrangian for this electrical system is

$$L = T - U = \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n}\left[M_{ik}\dot{q}_i\dot{q}_k - \frac{q_i q_k}{C_{ik}}\right] \tag{$\alpha$}$$

Assuming that Ohm’s Law is obeyed, that is, the dissipation force depends linearly on velocity, then the
Rayleigh dissipation function can be written in the form

$$R \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{k=1}^{n} R_{ik}\dot{q}_i\dot{q}_k \tag{$\beta$}$$

where $R_{ik}$ is the resistance matrix. Thus, for a symmetric resistance matrix, the dissipation force, expressed in volts, is given by

$$F_i = -\frac{\partial R}{\partial \dot{q}_i} = -\sum_{k=1}^{n} R_{ik}\dot{q}_k \tag{$\gamma$}$$

Inserting equations $\alpha$, $\beta$, and $\gamma$ into equation 10.18, plus making the assumption that an additional
generalized electrical force $Q_i = \mathcal{E}_i(t)$ volts is acting on circuit $i$, then the Euler-Lagrange equations give the
following equations of motion.

$$\sum_{k=1}^{n}\left[M_{ik}\ddot{q}_k + R_{ik}\dot{q}_k + \frac{q_k}{C_{ik}}\right] = \mathcal{E}_i(t)$$

This is a generalized version of Kirchhoff’s loop rule which can be seen by considering the case where the
diagonal term $i = k$ is the only non-zero term. Then

$$M_{ii}\ddot{q}_i + R_{ii}\dot{q}_i + \frac{q_i}{C_{ii}} = \mathcal{E}_i(t)$$
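The single-loop case lends itself to a short numerical check of the energy balance implied by the Rayleigh dissipation function. The sketch below is illustrative only: the inductance, resistance, and capacitance values are hypothetical, no external emf is applied, and the loop equation is assumed to have the form $L\ddot{q} + R\dot{q} + q/C = 0$. It integrates the loop equation and compares the loss of stored energy $\frac{1}{2}L\dot{q}^2 + \frac{q^2}{2C}$ with the time integral of the ohmic power $R\dot{q}^2$, which is twice the Rayleigh dissipation function.

```python
# Hypothetical single-loop values: self inductance, resistance, capacitance.
L_ind, R_res, C_cap = 0.5, 0.4, 0.2
h = 1e-4                      # time step


def deriv(q, i):
    """Loop equation L*di/dt + R*i + q/C = 0 written as two first-order equations."""
    return i, -(R_res * i + q / C_cap) / L_ind


def rk4(q, i):
    """One fourth-order Runge-Kutta step for the charge q and current i."""
    k1 = deriv(q, i)
    k2 = deriv(q + h / 2 * k1[0], i + h / 2 * k1[1])
    k3 = deriv(q + h / 2 * k2[0], i + h / 2 * k2[1])
    k4 = deriv(q + h * k3[0], i + h * k3[1])
    return (q + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            i + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))


def stored_energy(q, i):
    """Magnetic plus electrical energy stored in the loop."""
    return 0.5 * L_ind * i * i + 0.5 * q * q / C_cap


q, i = 1.0, 0.0               # initially charged capacitor, no current
e_start = stored_energy(q, i)
dissipated = 0.0
for _ in range(20000):
    q_new, i_new = rk4(q, i)
    # Trapezoidal estimate of the integral of the ohmic power R*i^2.
    dissipated += 0.5 * h * (R_res * i * i + R_res * i_new * i_new)
    q, i = q_new, i_new
e_end = stored_energy(q, i)
```

The difference `e_start - e_end` matches `dissipated` to within the integration error, which is the circuit analogue of equation 10.12.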


This sum of the voltages is identical to the usual expression for Kirchhoff’s loop rule. This example
illustrates the power of variational methods when applied to fields beyond classical mechanics.


10.5 Dissipative Lagrangians


The prior discussion of nonconservative systems mentioned the following three ways to incorporate dissipative 
processes into Lagrangian or Hamiltonian mechanics. (1) Expand the number of degrees of freedom to include 
all the active dissipative degrees of freedom as well as the conservative ones. (2) Use generalized forces
to incorporate dissipative processes. (3) Add dissipative terms to the Lagrangian or Hamiltonian to mimic 
dissipation. The following illustrates the use of dissipative Lagrangians. 

Bateman[Bat31] pointed out that an isolated dissipative system is physically incomplete, that is, a com- 
plete system must comprise at least two coupled subsystems where energy is transferred from a dissipating 
subsystem to an absorbing subsystem. A complete system should comprise both the dissipating and ab- 
sorbing systems to ensure that the total system Lagrangian and Hamiltonian are conserved, as is assumed 
in conventional Lagrangian and Hamiltonian mechanics. Both Bateman and Dekker[Dek75] have illustrated 
that the equations of motion for a linearly-damped, free, one-dimensional harmonic oscillator are derivable 
using the Hamilton variational principle via introduction of a fictitious complementary subsystem that mimics
dissipative processes. The following example illustrates that the equations of motion for the
linearly-damped, linear oscillator can be derived from three alternative, equivalent, non-standard Lagrangians
that assume either: (1) a multidimensional system, (2) explicitly time-dependent Lagrangians and Hamiltonians,
or (3) complex non-standard Lagrangians.


10.3 Example: The linearly-damped, linear oscillator: 


Three toy dynamical models have been used to describe the linearly-damped, linear oscillator employing 
very different non-standard Lagrangians to generate the required Hamiltonians, and to derive the correct 
equations of motion. 

1: Dual-component Lagrangian: $L_{Dual}$

Bateman proposed a dual system comprising a mass $m$ subject to two coupled one-dimensional variables
$(x, y)$ where $x$ is the observed variable and $y$ is the mirror variable for the subsystem that absorbs the energy
dissipated by the subsystem $x$.

Assume a non-standard Lagrangian of the form

$$L_{Dual} = \frac{m}{2}\left\{\dot{x}\dot{y} - \frac{\Gamma}{2}\left[y\dot{x} - x\dot{y}\right] - \omega_0^2 xy\right\} \tag{a}$$

where $\Gamma$ is the damping coefficient, that is, the linear damping constant divided by $m$. Requiring stationarity
under variation of the auxiliary variable $y$, that is, $\Lambda_y L = 0$, leads to the uncoupled equation of motion for $x$

$$\frac{m}{2}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{b}$$

Similarly, requiring stationarity under variation of the primary variable $x$, that is, $\Lambda_x L = 0$, leads to the
uncoupled equation of motion for $y$

$$\frac{m}{2}\left[\ddot{y} - \Gamma\dot{y} + \omega_0^2 y\right] = 0 \tag{c}$$

Note that equation of motion (b), which was obtained by variation of the auxiliary variable $y$, corresponds
to that for the usual free, linearly-damped, one-dimensional harmonic oscillator for the $x$ variable which
dissipates energy as is discussed in chapter 3.5. The equation of motion (c) is obtained by variation of the
primary variable $x$ and corresponds to a free linear, one-dimensional, oscillator for the $y$ variable that is
absorbing the energy dissipated by the dissipating $x$ system.

The generalized momenta,

$$p_i = \frac{\partial L}{\partial \dot{q}_i}$$

can be used to derive the corresponding Hamiltonian

$$H_{Dual}(x, p_x, y, p_y) = \left[p_x\dot{x} + p_y\dot{y} - L\right] = \frac{2p_x p_y}{m} - \frac{\Gamma}{2}\left[xp_x - yp_y\right] + \frac{m}{2}\left(\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2\right)xy \tag{d}$$

Note that this Hamiltonian is time independent, and thus is conserved for this complete dual-variable system.
Using Hamilton’s equations of motion gives the same two uncoupled equations of motion as obtained using
the Lagrangian, i.e. (b) and (c).
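The conservation of the dual Hamiltonian can be illustrated numerically. The following sketch is not from the original text. It assumes the dual Lagrangian has the form $L_{Dual} = \frac{m}{2}\{\dot{x}\dot{y} - \frac{\Gamma}{2}[y\dot{x} - x\dot{y}] - \omega_0^2 xy\}$, so that $p_x = \frac{m}{2}(\dot{y} - \frac{\Gamma}{2}y)$ and $p_y = \frac{m}{2}(\dot{x} + \frac{\Gamma}{2}x)$, and it uses hypothetical parameter values. It integrates the damped equation for $x$ together with the mirror equation for $y$, and evaluates the Hamiltonian at the start and end of the run.

```python
# Hypothetical mass, damping coefficient and natural angular frequency.
m, Gamma, w0 = 1.0, 0.3, 2.0


def deriv(s):
    """Equations of motion: x is damped, the mirror variable y is anti-damped."""
    x, vx, y, vy = s
    return [vx, -Gamma * vx - w0**2 * x, vy, Gamma * vy - w0**2 * y]


def rk4(s, h):
    """One fourth-order Runge-Kutta step (the system is autonomous)."""
    k1 = deriv(s)
    k2 = deriv([a + h / 2 * b for a, b in zip(s, k1)])
    k3 = deriv([a + h / 2 * b for a, b in zip(s, k2)])
    k4 = deriv([a + h * b for a, b in zip(s, k3)])
    return [a + h / 6 * (p + 2 * q + 2 * r + u)
            for a, p, q, r, u in zip(s, k1, k2, k3, k4)]


def h_dual(s):
    """Dual Hamiltonian evaluated from the assumed momenta p_x and p_y."""
    x, vx, y, vy = s
    px = 0.5 * m * (vy - 0.5 * Gamma * y)
    py = 0.5 * m * (vx + 0.5 * Gamma * x)
    return (2 * px * py / m
            - 0.5 * Gamma * (x * px - y * py)
            + 0.5 * m * (w0**2 - (0.5 * Gamma)**2) * x * y)


def x_energy(s):
    """Mechanical energy of the observed x oscillator alone."""
    x, vx, _, _ = s
    return 0.5 * m * (vx**2 + w0**2 * x**2)


state = [1.0, 0.0, 0.5, 0.0]          # x, xdot, y, ydot at t = 0
h_start, ex_start = h_dual(state), x_energy(state)
for _ in range(5000):
    state = rk4(state, 1e-3)
h_end, ex_end = h_dual(state), x_energy(state)
```

The energy of the observed $x$ subsystem decays while `h_dual` for the combined system stays constant to within the integration error.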

2: Time-dependent Lagrangian: $L_{Damped}$

The complementary subsystem of the above dual-component Lagrangian, that is added to the primary
dissipative subsystem, is the adjoint to the equations for the primary subsystem of interest. In some cases, a
set of the solutions of the complementary equations can be expressed in terms of the solutions of the primary
subsystem allowing the equations of motion to be expressed solely in terms of the variables of the primary
subsystem. Inspection of the solutions of the damped harmonic oscillator, presented in chapter 3.5, implies
that $x$ and $y$ must be related by the function

$$y = xe^{\Gamma t} \tag{e}$$

Therefore Bateman proposed a time-dependent, non-standard Lagrangian $L_{Damped}$ of the form

$$L_{Damped} = \frac{m}{2}e^{\Gamma t}\left[\dot{x}^2 - \omega_0^2 x^2\right] \tag{f}$$

This Lagrangian $L_{Damped}$ corresponds to a harmonic oscillator for which the mass $m = m_0 e^{\Gamma t}$ is accreting
exponentially with time in order to mimic the exponential energy dissipation. Use of this Lagrangian in the
Euler-Lagrange equations gives

$$me^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{g}$$

If the factor outside of the bracket is non-zero, then the equation in the bracket must be zero. The expression
in the bracket is the required equation of motion for the linearly-damped linear oscillator. This Lagrangian
generates a generalized momentum of

$$p_x = m\dot{x}e^{\Gamma t}$$

and the Hamiltonian is

$$H_{Damped} = p_x\dot{x} - L_{Damped} = \frac{p_x^2}{2m}e^{-\Gamma t} + \frac{m}{2}\omega_0^2 e^{\Gamma t}x^2 \tag{h}$$

The Hamiltonian is time dependent as expected. This leads to Hamilton's equations of motion

$$\dot{x} = \frac{\partial H_{Damped}}{\partial p_x} = \frac{p_x}{m}e^{-\Gamma t} \tag{i}$$

$$\dot{p}_x = -\frac{\partial H_{Damped}}{\partial x} = -m\omega_0^2 e^{\Gamma t}x \tag{j}$$

Taking the total time derivative of equation (i) and using equation (j) to substitute for $\dot{p}_x$ gives

$$me^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{k}$$

If the term $me^{\Gamma t}$ is non-zero, then the term in brackets is zero. The term in the bracket is the usual equation
of motion for the linearly-damped harmonic oscillator.
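Hamilton's equations for the time-dependent Hamiltonian can be integrated directly and compared with the familiar underdamped solution. The sketch below is illustrative: it assumes Hamilton's equations take the form $\dot{x} = (p_x/m)e^{-\Gamma t}$ and $\dot{p}_x = -m\omega_0^2 e^{\Gamma t}x$, the parameter values are hypothetical, and the reference solution $x(t) = e^{-\Gamma t/2}\left[\cos\omega t + \frac{\Gamma}{2\omega}\sin\omega t\right]$ with $\omega^2 = \omega_0^2 - (\Gamma/2)^2$ corresponds to the initial conditions $x(0) = 1$, $\dot{x}(0) = 0$.

```python
import math

# Hypothetical mass, damping coefficient and natural angular frequency (underdamped case).
m, Gamma, w0 = 1.0, 0.4, 3.0


def hamilton_rhs(t, x, p):
    """Right-hand sides of Hamilton's equations for the time-dependent Hamiltonian."""
    x_dot = (p / m) * math.exp(-Gamma * t)
    p_dot = -m * w0**2 * math.exp(Gamma * t) * x
    return x_dot, p_dot


def integrate(x, p, t_end, h):
    """Fourth-order Runge-Kutta integration of (x, p) from t = 0 to t_end."""
    t = 0.0
    for _ in range(round(t_end / h)):
        k1 = hamilton_rhs(t, x, p)
        k2 = hamilton_rhs(t + h / 2, x + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = hamilton_rhs(t + h / 2, x + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = hamilton_rhs(t + h, x + h * k3[0], p + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return x, p


def damped_solution(t):
    """Underdamped oscillator solution for x(0) = 1, xdot(0) = 0."""
    w = math.sqrt(w0**2 - (Gamma / 2)**2)
    return math.exp(-Gamma * t / 2) * (math.cos(w * t) + Gamma / (2 * w) * math.sin(w * t))


x_num, p_num = integrate(1.0, 0.0, 4.0, 1e-3)
x_exact = damped_solution(4.0)
```

Although the canonical momentum $p_x$ grows with the factor $e^{\Gamma t}$, the coordinate `x_num` follows the damped solution `x_exact`.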

3: Complex Lagrangian: $L_{Complex}$

Dekker proposed use of complex dynamical variables for solving the linearly-damped harmonic oscillator.
It exploits the fact that, in principle, each second order differential equation can be expressed in terms of
a set of first-order differential equations. This feature is the essential difference between Lagrangian and
Hamiltonian mechanics. Let $q$ be complex and assume it can be expressed in the form of a real variable $x$ as

$$q = \dot{x} - \left(i\omega - \frac{\Gamma}{2}\right)x \tag{l}$$

Substituting this complex variable into the relation

$$\dot{q} + \left[i\omega + \frac{\Gamma}{2}\right]q = 0 \tag{m}$$

leads to the second-order equation for the real variable $x$ of

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{n}$$

This is the desired equation of motion for the linearly-damped harmonic oscillator. This result also can be
shown by taking the time derivative of equation (m) and using equation (m) to substitute for $\dot{q}$, i.e.

$$\ddot{q} + \left[i\omega + \frac{\Gamma}{2}\right]\dot{q} = \ddot{q} + \Gamma\dot{q} + \left(i\omega - \frac{\Gamma}{2}\right)\dot{q} = \ddot{q} + \Gamma\dot{q} + \omega_0^2 q = 0 \tag{o}$$

the real part of which has the same form as equation (n). This feature is exploited using the following Lagrangian

$$L_{Complex} = \frac{i}{2}\left(q^*\dot{q} - q\dot{q}^*\right) - \left[\omega - i\frac{\Gamma}{2}\right]q^*q \tag{p}$$

where $\omega^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$. The Lagrangian $L_{Complex}$ is real for a conservative system and complex for a
dissipative system. Using the Lagrange-Euler equation for variation of $q^*$, that is, $\Lambda_{q^*}L_{Complex} = 0$, gives
equation (m) which leads to the required equation of motion (n).

The canonical conjugate momenta are given by

$$p = \frac{\partial L_{Complex}}{\partial \dot{q}} \qquad \tilde{p} = \frac{\partial L_{Complex}}{\partial \dot{q}^*} \tag{q}$$

The above Lagrangian plus canonically conjugate momenta lead to the complementary Hamiltonians

$$H_{Complex}(p, q, \tilde{p}, q^*) = \left(i\omega + \frac{\Gamma}{2}\right)\left(\tilde{p}q^* - pq\right) \tag{r}$$

$$\tilde{H}_{Complex}(p, q, \tilde{p}, q^*) = \left(i\omega - \frac{\Gamma}{2}\right)\left(\tilde{p}q^* - pq\right) \tag{s}$$

These Hamiltonians give Hamilton equations of motion that lead to the correct equations of motion for $q$
and $q^*$.
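The link between the complex first-order equation and the real damped oscillator can be checked with a few lines of code. The sketch below is illustrative and rests on stated assumptions: the complex variable is taken to be $q = \dot{x} - (i\omega - \frac{\Gamma}{2})x$, the first-order relation is $\dot{q} + (i\omega + \frac{\Gamma}{2})q = 0$, and the parameter values are hypothetical. It builds $q(t)$ from the real damped solution $x(t) = e^{-\Gamma t/2}\cos\omega t$ and compares it with the exponential solution of the first-order equation.

```python
import cmath
import math

# Hypothetical damping coefficient and natural angular frequency.
Gamma, w0 = 0.5, 2.0
w = math.sqrt(w0**2 - (Gamma / 2)**2)     # omega^2 = omega_0^2 - (Gamma/2)^2


def x_of_t(t):
    """A real solution of xddot + Gamma*xdot + w0^2 * x = 0."""
    return math.exp(-Gamma * t / 2) * math.cos(w * t)


def xdot_of_t(t):
    """Time derivative of x_of_t."""
    return math.exp(-Gamma * t / 2) * (-(Gamma / 2) * math.cos(w * t) - w * math.sin(w * t))


def q_of_t(t):
    """Complex variable q built from the real coordinate x (assumed form)."""
    return xdot_of_t(t) - (1j * w - Gamma / 2) * x_of_t(t)


def q_expected(t):
    """Solution of qdot + (i*w + Gamma/2) * q = 0 with the same initial value."""
    return q_of_t(0.0) * cmath.exp(-(1j * w + Gamma / 2) * t)


times = [0.0, 0.3, 1.0, 2.5, 7.0]
errors = [abs(q_of_t(t) - q_expected(t)) for t in times]
```

Every entry of `errors` is at rounding level, so a real solution of the second-order equation maps onto a solution of the complex first-order equation.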

The above examples have shown that three very different, non-standard, Lagrangians, plus their corre- 
sponding Hamiltonians, all lead to the correct equation of motion for the linearly-damped harmonic oscilla- 
tor. This illustrates the power of using non-standard Lagrangians to describe dissipative motion in classical 
mechanics. However, postulating non-standard Lagrangians to produce the required equations of motion 
appears to be of questionable usefulness. A fundamental approach is needed to build a firm foundation upon 
which non-standard Lagrangian mechanics can be based. Non-standard Lagrangian mechanics remains an
active, albeit narrow, frontier of classical mechanics.


10.6 Summary 


Dissipative drag forces are non-conservative and usually are velocity dependent. Chapter 4 showed that the 
motion of non-linear dissipative dynamical systems can be highly sensitive to the initial conditions and can 
lead to chaotic motion. 


Algebraic mechanics for nonconservative systems: Since Lagrangian and Hamiltonian formulations
are invalid for the nonconservative degrees of freedom, the following three approaches are used to include
nonconservative degrees of freedom directly in the Lagrangian and Hamiltonian formulations of mechanics.


1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system,
so that the expanded system is conservative. This is the preferred approach when it is viable.
Unfortunately this approach typically is impractical for handling dissipative processes because of the large
number of degrees of freedom that are involved in thermal dissipation.


2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces
$Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh's
dissipation function provides an elegant and powerful way to express the generalized forces in terms of
a scalar potential.


3. New degrees of freedom or effective forces can be postulated that are then incorporated into the 
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces.


Rayleigh’s dissipation function: Generalized dissipative forces that have a linear velocity dependence
can be easily handled in Lagrangian or Hamiltonian mechanics by introducing the powerful Rayleigh’s
dissipation function $R(\dot{\mathbf{q}})$ where

$$R(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$

This approach is used extensively in physics. It has been generalized to nonlinear velocity dependence by
defining a generalized Rayleigh dissipation function through

$$F^f_{q_i} = -\frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_i} \tag{10.16}$$

where the generalized Rayleigh dissipation function $R(\mathbf{q}, \dot{\mathbf{q}})$ satisfies the general Lagrange mechanics relation

$$\frac{\delta L}{\delta q} - \frac{\partial R}{\partial \dot{q}} = 0 \tag{10.17}$$

This generalized Rayleigh dissipation function eliminates the prior restriction to linear dissipation processes,
which greatly expands the range of validity for using Rayleigh’s dissipation function.


Rayleigh dissipation in Lagrange equations of motion: Linear dissipative forces can be directly, and
elegantly, included in Lagrangian mechanics by using Rayleigh’s dissipation function as a generalized force
$Q_j^f$. Inserting Rayleigh dissipation function 10.15 in the generalized Lagrange equations of motion 6.60 gives

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.18}$$

where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear,
velocity-dependent, frictional force $Q_j^f$. The holonomic forces of constraint are absorbed into the Lagrange multiplier
term.


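Rayleigh dissipation in Hamiltonian mechanics: If the nonconservative forces depend linearly on
velocity, and are derivable from Rayleigh’s dissipation function according to equation 10.15, then using the
definition of generalized momentum gives

$$\dot{p}_j = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.19}$$

$$\dot{p}_j = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.20}$$

Thus Hamilton’s equations become

$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{10.21}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}\right] - \frac{\partial R(\mathbf{q}, \dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.22}$$

The Rayleigh dissipation function $R(\mathbf{q}, \dot{\mathbf{q}})$ provides an elegant and convenient way to account for
dissipative forces in both Lagrangian and Hamiltonian mechanics.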


Dissipative Lagrangians or Hamiltonians: New degrees of freedom or effective forces can be postulated
that are then incorporated into the Lagrangian or the Hamiltonian in order to mimic the effects of the 
nonconservative forces. This approach has been used for special cases. 


Chapter 11 


Conservative two-body central forces 


11.1 Introduction 


Conservative two-body central forces are important in physics because of the pivotal role that the Coulomb 
and the gravitational forces play in nature. The Coulomb force plays a role in electrodynamics, molecular, 
atomic, and nuclear physics, while the gravitational force plays an analogous role in celestial mechanics. 
Therefore this chapter focusses on the physics of systems involving conservative two-body central forces 
because of the importance and ubiquity of these conservative two-body central forces in nature. 

A conservative two-body central force has the following three important attributes. 


1. Conservative: A conservative force depends only on the particle position, that is, the force is not 
time dependent. Moreover the work done by the force moving a body between any two points 1 and 2 
is path independent. Conservative fields are discussed in chapter 2.10. 


2. Two-body: A two-body force between two bodies depends only on the relative locations of the two 
interacting bodies and is not influenced by the proximity of additional bodies. For two-body forces 
acting between n bodies, the force on body 1 is the vector superposition of the two-body forces due 
to the interactions with each of the other n — 1 bodies. This differs from three-body forces where the 
force between any two bodies is influenced by the proximity of a third body. 


3. Central: A central force field depends on the distance $r_{12}$ from the origin of the force at point 1 to
the body location at point 2, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.


A conservative, two-body, central force combines the above three attributes and can be expressed as, 


$$\mathbf{F}_{21} = f(r_{12})\,\hat{\mathbf{r}}_{12} \tag{11.1}$$


The force field $\mathbf{F}_{21}$ has a magnitude $f(r_{12})$ that depends only on the magnitude of the relative separation
vector $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ between the origin of the force at point 1 and point 2 where the force acts, and the force
is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.

Chapter 2.10 showed that if a two-body central force is conservative, then it can be written as the gradient 
of a scalar potential energy U(r) which is a function of the distance from the center of the force field. 


$$\mathbf{F}_{21} = -\nabla U(r_{12}) \tag{11.2}$$


As discussed in chapter 2, the ability to represent the conservative central force by a scalar function U(r) 
greatly simplifies the treatment of central forces. 

The Coulomb and gravitational forces both are true conservative, two-body, central forces whereas the 
nuclear force between nucleons in the nucleus has three-body components. Two bodies interacting via a 
two-body central force is the simplest possible system to consider, but equation 11.1 is applicable equally 
for n bodies interacting via two-body central forces because the superposition principle applies for two-body 
central forces. This chapter will focus first on the motion of two bodies interacting via conservative two-body 
central forces followed by a brief discussion of the motion for $n > 2$ interacting bodies.


11.2 Equivalent one-body representation for two-body motion


The motion of two bodies, 1 and 2, interacting via two-body
central forces, requires 6 spatial coordinates, that is, three each
for $\mathbf{r}_1$ and $\mathbf{r}_2$. Since the two-body central force only depends on
the relative separation $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ of the two bodies, it is more
convenient to separate the 6 degrees of freedom into 3 spatial
coordinates of relative motion $\mathbf{r}$, plus 3 spatial coordinates for
the center-of-mass location $\mathbf{R}$, as described in chapter 2.7. It will
be shown here that the equation of motion for relative motion
of the two bodies in the center of mass can be represented by an
equivalent one-body problem which simplifies the mathematics.

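Consider two bodies acted upon by a conservative two-body
central force, where the position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ specify the
location of each particle as illustrated in figure 11.1. An alternate
set of six variables would be the three components of the center
of mass position vector $\mathbf{R}$ and the three components specifying
the difference vector $\mathbf{r}$ defined by figure 11.1. Define the vectors
$\mathbf{r}'_1$ and $\mathbf{r}'_2$ as the position vectors of the masses $m_1$ and $m_2$ with
respect to the center of mass. Then

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 \qquad \mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 \tag{11.3}$$

Figure 11.1: Center of mass coordinates for the two-body system.

By the definition of the center of mass

$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2} \tag{11.4}$$

and

$$m_1\mathbf{r}'_1 + m_2\mathbf{r}'_2 = 0 \tag{11.5}$$

so that

$$-\frac{m_1}{m_2}\mathbf{r}'_1 = \mathbf{r}'_2 \tag{11.6}$$

Therefore

$$\mathbf{r} = \mathbf{r}'_1 - \mathbf{r}'_2 = \frac{m_1 + m_2}{m_2}\mathbf{r}'_1 \tag{11.7}$$

that is,

$$\mathbf{r}'_1 = \frac{m_2}{m_1 + m_2}\mathbf{r} \tag{11.8}$$

Similarly,

$$\mathbf{r}'_2 = -\frac{m_1}{m_1 + m_2}\mathbf{r} \tag{11.9}$$

Substituting these into equation 11.3 gives

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\mathbf{r}$$
$$\mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\mathbf{r} \tag{11.10}$$

That is, the two vectors $\mathbf{r}_1, \mathbf{r}_2$ are written in terms of the position vector for the center of mass $\mathbf{R}$ and the
position vector $\mathbf{r}$ for relative motion in the center of mass.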

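Assuming that the two-body central force is conservative and represented by $U(r)$, then the Lagrangian
of the two-body system can be written as

$$L = \frac{1}{2}m_1\left|\dot{\mathbf{r}}_1\right|^2 + \frac{1}{2}m_2\left|\dot{\mathbf{r}}_2\right|^2 - U(r) \tag{11.11}$$

Differentiating equations 11.10, with respect to time, and inserting them into the Lagrangian, gives

$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \tag{11.12}$$

where the total mass $M$ is defined as

$$M = m_1 + m_2 \tag{11.13}$$

and the reduced mass $\mu$ is defined by

$$\mu \equiv \frac{m_1 m_2}{m_1 + m_2} \tag{11.14}$$

or equivalently

$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} \tag{11.15}$$

The total Lagrangian can be separated into two independent parts

$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + L_{cm} \tag{11.16}$$

where

$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \tag{11.17}$$

Assuming that no external forces are acting, then $\frac{\partial L}{\partial \mathbf{R}} = 0$ and the three Lagrange equations for each of the
three coordinates of the $\mathbf{R}$ coordinate can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{R}}} = \frac{d\mathbf{P}_{cm}}{dt} = 0 \tag{11.18}$$

That is, for a pure central force, the center-of-mass momentum $\mathbf{P}_{cm}$ is a constant of motion where

$$\mathbf{P}_{cm} = \frac{\partial L}{\partial \dot{\mathbf{R}}} = M\dot{\mathbf{R}} \tag{11.19}$$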

It is convenient to work in the center-of-mass frame using
the effective Lagrangian $L_{cm}$. In the center-of-mass frame of
reference, the translational kinetic energy $\frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2$ associated
with center-of-mass motion is ignored, and only the energy in
the center-of-mass is considered. This center-of-mass energy
is the energy involved in the interaction between the colliding
bodies. Thus, in the center-of-mass, the problem has been reduced
to an equivalent one-body problem of a mass $\mu$ moving
about a fixed force center with a path given by $\mathbf{r}$ which is the
separation vector between the two bodies, as shown in figure
11.2. In reality, both masses revolve around their center of
mass, also called the barycenter, in the center-of-mass frame
as shown in figure 11.2. Knowing $\mathbf{r}$ allows the trajectory of
each mass about the center of mass, $\mathbf{r}'_1$ and $\mathbf{r}'_2$, to be calculated.
Of course the true path in the laboratory frame of
reference must take into account the translational motion
of the center of mass, in addition to the motion of the
equivalent one-body representation relative to the barycenter.
Be careful to remember the difference between the actual trajectories
of each body, and the effective trajectory assumed
when using the reduced mass which only determines the relative
separation $\mathbf{r}$ of the two bodies. This reduction to an
equivalent one-body problem greatly simplifies the solution
of the motion, but it misrepresents the actual trajectories and the spatial locations of each mass in space.
The equivalent one-body representation will be used extensively throughout this chapter.


Figure 11.2: Orbits of a two-body system with mass ratio of 2 rotating about the center-of-mass, O. The dashed
ellipse is the equivalent one-body orbit with the center of force at the focus O.


11.3 Angular momentum L


The notation used for the angular momentum vector is L where the magnitude is designated by |L| = z. 
Be careful not to confuse the angular momentum vector L with the Lagrangian Lem. Note that the angular 
momentum for two-body rotation about the center of mass with angular velocity w is identical when evaluated 
in either the laboratory or equivalent two-body representation. That is, using equations 11.8 and 11.9 


L = mrw + mr’ w =urw (11.20) 


The center-of-mass Lagrangian leads to the following two general properties regarding the angular mo- 
mentum vector L. 
1) The motion lies entirely in a plane perpendicular to the fixed direction of the total angular momentum vector. This is because

$$\mathbf{L}\cdot\mathbf{r} = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{r} = 0 \tag{11.21}$$

that is, the radius vector is in the plane perpendicular to the total angular momentum vector. Thus, it is possible to express the Lagrangian in polar coordinates $(r, \psi)$ rather than spherical coordinates. In polar coordinates the center-of-mass Lagrangian becomes

$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{11.22}$$


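2) If the potential is spherically symmetric, then the polar angle $\psi$ is cyclic and therefore Noether's theorem gives that the angular momentum $\mathbf{p}_\psi \equiv \mathbf{L} = \mathbf{r}\times\mathbf{p}$ is a constant of motion. That is, since $\frac{\partial L_{cm}}{\partial\psi} = 0$, then the Lagrange equations imply that

$$\dot{\mathbf{p}}_\psi = \frac{d}{dt}\frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = 0 \tag{11.23}$$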
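where the vectors $\dot{\mathbf{p}}_\psi$ and $\dot{\boldsymbol{\psi}}$ imply that equation 11.23 refers to three independent equations corresponding to the three components of these vectors. Thus the angular momentum $\mathbf{p}_\psi$, conjugate to $\boldsymbol{\psi}$, is a constant of motion. The generalized momentum $\mathbf{p}_\psi$ is a first integral of the motion which equals

$$\mathbf{p}_\psi = \frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = \mu r^2\dot{\boldsymbol{\psi}} = \hat{\mathbf{p}}_\psi l \tag{11.24}$$

where the magnitude of the angular momentum $l$, and the direction $\hat{\mathbf{p}}_\psi$, both are constants of motion.

A simple geometric interpretation of equation 11.24 is illustrated in figure 11.3. The radius vector sweeps out an area $d\mathbf{A}$ in time $dt$ where

$$d\mathbf{A} = \frac{1}{2}\mathbf{r}\times\mathbf{v}\,dt \tag{11.25}$$

and the vector $\mathbf{A}$ is perpendicular to the x-y plane. The rate of change of area is

$$\frac{d\mathbf{A}}{dt} = \frac{1}{2}\mathbf{r}\times\mathbf{v} \tag{11.26}$$

But the angular momentum is

$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \mu\,\mathbf{r}\times\mathbf{v} = 2\mu\frac{d\mathbf{A}}{dt} \tag{11.27}$$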

Thus the conservation of angular momentum implies that the areal velocity $\frac{d\mathbf{A}}{dt}$ also is a constant of motion. This fact is called Kepler's second law of planetary motion which he deduced in 1609 based on Tycho Brahe's 55 years of observational records of the motion of Mars. Kepler's second law implies that a planet moves fastest when closest to the sun and slowest when farthest from the sun. Note that Kepler's second law is a statement of the conservation of angular momentum which is independent of the radial form of the central potential.

Figure 11.3: Area swept out by the radius vector in the time dt.
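The constancy of the areal velocity can be illustrated with a short numerical experiment. The following is a minimal sketch, not part of the original text, that integrates the equivalent one-body motion for an attractive inverse-square force with the velocity-Verlet scheme and records (1/2)|r x v| at every step. The integrator choice, the function name `areal_velocities`, and all parameter values are assumptions made for illustration.

```python
import math


def areal_velocities(k=-1.0, mu=1.0, r0=(1.0, 0.0), v0=(0.0, 0.8),
                     dt=1e-3, steps=5000):
    """Integrate mu*r'' = (k/r^2) r_hat and return 0.5*(x*vy - y*vx) per step."""
    x, y = r0
    vx, vy = v0

    def accel(px, py):
        r = math.hypot(px, py)
        f = k / (mu * r**3)      # factor multiplying the position vector
        return f * px, f * py

    ax, ay = accel(x, y)
    rates = []
    for _ in range(steps):
        rates.append(0.5 * (x * vy - y * vx))
        # Half kick, drift, half kick (velocity Verlet)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
        x += dt * vx
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax
        vy += 0.5 * dt * ay
    return rates


rates = areal_velocities()
spread = max(rates) - min(rates)
```

Because every kick is parallel to the radius vector and every drift is parallel to the velocity, this scheme leaves r x v unchanged apart from rounding, so `spread` stays tiny even though the speed varies strongly around the orbit. Replacing the force law by any other central force should leave that property intact.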




11.4 Equations of motion 


The equations of motion for two bodies interacting via a conservative two-body central force can be determined using the center-of-mass Lagrangian, $L_{cm}$, given by equation 11.22. For the radial coordinate, the operator equation $\Lambda_r L_{cm} = 0$ for Lagrangian mechanics leads to

$$\frac{d}{dt}\left(\mu\dot{r}\right) - \mu r\dot{\psi}^2 + \frac{\partial U}{\partial r} = 0 \tag{11.28}$$

But

$$\dot{\psi} = \frac{l}{\mu r^2} \tag{11.29}$$

therefore the radial equation of motion is

$$\mu\ddot{r} = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.30}$$

Similarly, for the angular coordinate, the operator equation $\Lambda_\psi L_{cm} = 0$ leads to equation 11.24. That is, the angular equation of motion for the magnitude of $p_\psi$ is

$$p_\psi = \frac{\partial L_{cm}}{\partial\dot{\psi}} = \mu r^2\dot{\psi} = l \tag{11.31}$$

Lagrange's equations have given two equations of motion, one dependent on radius $r$ and the other on the polar angle $\psi$. Note that the radial acceleration is just a statement of Newton's Laws of motion for the radial force $F_r$ in the center-of-mass system of

$$F_r = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.32}$$

This can be written in terms of an effective potential

$$U_{eff}(r) \equiv U(r) + \frac{l^2}{2\mu r^2} \tag{11.33}$$

which leads to an equation of motion

$$F_r = \mu\ddot{r} = -\frac{\partial U_{eff}(r)}{\partial r} \tag{11.34}$$


Since $\frac{l^2}{\mu r^3} = \mu r\dot{\psi}^2$, the second term in equation (11.33) is the potential of the usual centrifugal force $\frac{l^2}{\mu r^3}$ that originates because the variable $r$ is in a non-inertial, rotating frame of reference. Note that the angular equation of motion is independent of the radial dependence of the conservative two-body central force.

Figure 11.4 shows, by dashed lines, the radial dependence of the potential corresponding to the attractive inverse square law force, that is $U = \frac{k}{r}$, and the potential corresponding to the centrifugal term $\frac{l^2}{2\mu r^2}$ corresponding to a repulsive centrifugal force. The sum of these two potentials $U_{eff}(r)$, shown by the solid line, has a minimum $U_{min}$ value at a certain radius similar to that manifest by the diatomic molecule discussed in example 2.7.
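The location and depth of the minimum of the effective potential can be found numerically and compared with the values obtained by setting the derivative of equation 11.33 to zero for the inverse-square case. The sketch below is illustrative only: the golden-section search, the function names, the variable name `ell` (standing for the angular momentum l), and the parameter values are assumptions, not something taken from the text.

```python
import math


def u_eff(r, k, ell, mu):
    """Effective potential k/r + ell^2/(2 mu r^2), with k < 0 for attraction."""
    return k / r + ell**2 / (2.0 * mu * r**2)


def golden_section_minimum(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return 0.5 * (a + b)


k, ell, mu = -1.0, 0.8, 1.0
r_numeric = golden_section_minimum(lambda r: u_eff(r, k, ell, mu), 0.05, 10.0)
r_analytic = -ell**2 / (mu * k)            # from dU_eff/dr = 0
u_min_analytic = -mu * k**2 / (2.0 * ell**2)
```

For these values the minimum sits at r = 0.64 with a depth of -0.78125, which is the radius and energy of the circular orbit with this angular momentum.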

It is remarkable that the six-dimensional equations of motion, for two bodies interacting via a two-body central force, have been reduced to trivial center-of-mass translational motion, plus a one-dimensional one-body problem given by (11.34) in terms of the relative separation $r$ and an effective potential $U_{eff}(r)$.

Figure 11.4: The attractive inverse-square law potential ($\frac{k}{r}$), the centrifugal potential ($\frac{l^2}{2\mu r^2}$), and the combined effective bound potential.
11.5 Differential orbit equation:


The differential orbit equation relates the shape of the orbital motion, in plane polar coordinates, to the radial dependence of the two-body central force. A Binet coordinate transformation, which depends on the functional form of $\mathbf{F}(\mathbf{r})$, can simplify the differential orbit equation. For the inverse-square law force, the best Binet transformed variable is $u$ which is defined to be

$$u \equiv \frac{1}{r} \tag{11.35}$$

Inserting the transformed variable $u$ into equation 11.29 gives

$$\dot{\psi} = \frac{lu^2}{\mu} \tag{11.36}$$

From the definition of the new variable

$$\frac{dr}{dt} = -u^{-2}\frac{du}{dt} = -u^{-2}\dot{\psi}\frac{du}{d\psi} = -\frac{l}{\mu}\frac{du}{d\psi} \tag{11.37}$$

Differentiating again gives

$$\frac{d^2r}{dt^2} = -\frac{l}{\mu}\frac{d}{dt}\left(\frac{du}{d\psi}\right) = -\left(\frac{lu}{\mu}\right)^2\frac{d^2u}{d\psi^2} \tag{11.38}$$

Substituting these into Lagrange's radial equation of motion gives

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) \tag{11.39}$$

Binet's differential orbit equation directly relates $\psi$ and $r$ which determines the overall shape of the orbit trajectory. This shape is crucial for understanding the orbital motion of two bodies interacting via a two-body central force. Note that for the special case of an inverse square-law force, that is where $F(\frac{1}{u}) = ku^2$, then the right-hand side of equation 11.39 equals a constant $-\frac{\mu k}{l^2}$ since the orbital angular momentum is a conserved quantity.
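Binet's equation 11.39 is an ordinary differential equation in the angle, so it can be integrated directly for any radial force law. The sketch below, which is an illustration rather than the author's method, uses a classical fourth-order Runge-Kutta step and compares the result for the inverse-square force with the conic-section form u = -(μk/l²)(1 + e cos ψ). The function name `integrate_binet`, the variable `ell` for the angular momentum l, and the parameter values are assumptions.

```python
import math


def integrate_binet(force, mu, ell, u0, du0, psi_end, n=2000):
    """RK4 integration of u'' + u = -(mu/ell^2) F(1/u)/u^2 from psi = 0 to psi_end."""
    def rhs(u, w):
        # Returns (du/dpsi, d2u/dpsi2)
        return w, -u - (mu / ell**2) * force(1.0 / u) / u**2

    h = psi_end / n
    u, w = u0, du0
    for _ in range(n):
        k1u, k1w = rhs(u, w)
        k2u, k2w = rhs(u + 0.5 * h * k1u, w + 0.5 * h * k1w)
        k3u, k3w = rhs(u + 0.5 * h * k2u, w + 0.5 * h * k2w)
        k4u, k4w = rhs(u + h * k3u, w + h * k3w)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w += h * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
    return u, w


mu, ell, k, ecc = 1.0, 1.0, -1.0, 0.5
u_start = -(mu * k / ell**2) * (1.0 + ecc)        # periapsis, where du/dpsi = 0
u_num, _ = integrate_binet(lambda r: k / r**2, mu, ell, u_start, 0.0, math.pi)
u_conic = -(mu * k / ell**2) * (1.0 + ecc * math.cos(math.pi))
```

Passing a different `force` function, for example an inverse-cube or a linear force, gives the corresponding orbit shape without any change to the integrator.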


11.1 Example: Central force leading to a circular orbit $r = 2R\cos\theta$

Binet's differential orbit equation can be used to derive the central potential that leads to the assumed circular trajectory of $r = 2R\cos\theta$ where $R$ is the radius of the circular orbit. Note that this circular orbit passes through the origin of the central force when $r = 2R\cos\theta = 0$.

Inserting this trajectory into Binet's differential orbit equation 11.39 gives

$$\frac{1}{2R}\frac{d^2(\cos\theta)^{-1}}{d\theta^2} + \frac{1}{2R}(\cos\theta)^{-1} = -\frac{\mu}{l^2}4R^2(\cos\theta)^2F\left(\frac{1}{u}\right) \tag{a}$$

Note that the differential is given by

$$\frac{d^2(\cos\theta)^{-1}}{d\theta^2} = \frac{d}{d\theta}\left(\frac{\sin\theta}{\cos^2\theta}\right) = \frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta}$$

Inserting this differential into equation a gives

$$\frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta} + \frac{1}{\cos\theta} = \frac{2}{\cos^3\theta} = -\frac{\mu}{l^2}8R^3(\cos\theta)^2F\left(\frac{1}{u}\right)$$

Thus the radial dependence of the required central force is

$$F = -\frac{2l^2}{8R^3\mu\cos^5\theta} = -\frac{8R^2l^2}{\mu}\frac{1}{r^5}$$

Circular trajectory passing through the origin of the central force.

This corresponds to an attractive central force that depends to the fifth power on the inverse radius $r$. Note that this example is unrealistic since the assumed orbit implies that the potential and kinetic energies are infinite when $r \to 0$ at $\theta \to \frac{\pi}{2}$.
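The result of this example can be verified numerically by substituting the trajectory and the inverse-fifth-power force back into Binet's equation. The following minimal sketch estimates the second derivative by a central finite difference; the function name `binet_residual`, the step size, the variable `ell` for the angular momentum l, and the sample angles are illustrative assumptions.

```python
import math


def binet_residual(theta, R=1.0, mu=1.0, ell=1.0, h=1e-4):
    """Residual u'' + u + (mu/ell^2) F(1/u)/u^2 for u = 1/(2 R cos(theta))."""
    def u(t):
        return 1.0 / (2.0 * R * math.cos(t))

    def force(r):
        # Force obtained in the example: F = -(8 R^2 ell^2 / mu) / r^5
        return -8.0 * R**2 * ell**2 / (mu * r**5)

    d2u = (u(theta + h) - 2.0 * u(theta) + u(theta - h)) / h**2
    return d2u + u(theta) + (mu / ell**2) * force(1.0 / u(theta)) / u(theta)**2


residuals = [binet_residual(t) for t in (-1.0, -0.3, 0.0, 0.4, 1.1)]
```

The residuals should vanish to within the finite-difference error at every angle short of θ = ±π/2, where the orbit passes through the force center.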




11.6 Hamiltonian 


Since the center-of-mass Lagrangian is not an explicit function of time, then

$$\frac{dH_{cm}}{dt} = -\frac{\partial L_{cm}}{\partial t} = 0 \tag{11.40}$$

Thus the center-of-mass Hamiltonian $H_{cm}$ is a constant of motion. However, since the transformation to center of mass can be time dependent, then $H_{cm} \neq E$, that is, it does not include the total energy because the kinetic energy of the center-of-mass motion has been omitted from $H_{cm}$. Also, since no time-dependent transformation is involved within the center-of-mass frame, then

$$H_{cm} = T_{cm} + U = E_{cm} \tag{11.41}$$

That is, the center-of-mass Hamiltonian $H_{cm}$ equals the center-of-mass total energy. The center-of-mass Hamiltonian then can be written using the effective potential (11.33) in the form

$$H_{cm} = \frac{p_r^2}{2\mu} + \frac{p_\psi^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + \frac{l^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + U_{eff}(r) = E_{cm} \tag{11.42}$$

It is convenient to express the center-of-mass Hamiltonian $H_{cm}$ in terms of the energy equation for the orbit in a central field using the transformed variable $u = \frac{1}{r}$. Substituting equations 11.33 and 11.37 into the Hamiltonian equation 11.42 gives the energy equation of the orbit

$$\frac{l^2}{2\mu}\left[\left(\frac{du}{d\psi}\right)^2 + u^2\right] + U\left(u^{-1}\right) = E_{cm} \tag{11.43}$$


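Since $p_r = \mu\dot{r}$, the Hamiltonian 11.42 can be written as

$$H_{cm} = \frac{\mu\dot{r}^2}{2} + \frac{l^2}{2\mu r^2} + U(r) = E_{cm} \tag{11.44}$$

then

$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{11.45}$$

The time dependence can be obtained by integration

$$t = \int\frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.46}$$

An inversion of this gives the solution in the standard form $r = r(t)$. However, it is more interesting to find the relation between $r$ and $\psi$. From relation 11.45 for $\frac{dr}{dt}$ then

$$dt = \frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.47}$$

while equation 11.29 gives

$$d\psi = \frac{l\,dt}{\mu r^2} = \frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.48}$$

Therefore

$$\psi = \int\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.49}$$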


which can be used to calculate the angular coordinate. This gives the relation between the radial and angular 
coordinates which specifies the trajectory. 

Although equations (11.45) and (11.49) formally give the solution, the actual solution can be derived 
analytically only for certain specific forms of the force law and these solutions differ for attractive versus 
repulsive interactions. 
11.7 General features of the orbit solutions


It is useful to look at the general features of the solutions of the equations of motion given by the equivalent one-body representation of the two-body motion. These orbits depend on the net center-of-mass energy $E_{cm}$. There are five possible situations depending on the center-of-mass total energy $E_{cm}$, where $U_{min}$ denotes the minimum of the effective potential $U_{eff}(r)$.

1) $E_{cm} > 0$: The trajectory is hyperbolic and has a minimum distance, but no maximum. The distance of closest approach is given when $\dot{r} = 0$. At the turning point $E_{cm} = U + \frac{l^2}{2\mu r^2}$.

2) $E_{cm} = 0$: It can be shown that the orbit for this case is parabolic.

3) $0 > E_{cm} > U_{min}$: For this case the equivalent orbit has both a maximum and minimum radial distance at which $\dot{r} = 0$. At the turning points the radial kinetic energy term is zero so $E_{cm} = U + \frac{l^2}{2\mu r^2}$. For the attractive inverse square law force the path is an ellipse with the focus at the center of attraction (Figure 11.5), which is Kepler's First Law. During the time that the radius ranges from $r_{min}$ to $r_{max}$ and back the radius vector turns through an angle $\Delta\psi$ which is given by

$$\Delta\psi = 2\int_{r_{min}}^{r_{max}}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.50}$$

The general path prescribes a rosette shape which is a closed curve only if $\Delta\psi$ is a rational fraction of $2\pi$.

4) $E_{cm} = U_{min}$: In this case $r$ is a constant implying that the path is circular since

$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} = 0 \tag{11.51}$$

5) $E_{cm} < U_{min}$: For this case the square root is imaginary and there is no real solution.

In general the orbit is not closed, and such open orbits do not repeat. Bertrand's Theorem states that the inverse-square central force, and the linear harmonic oscillator, are the only radial dependences of the central force that lead to stable closed orbits.
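The angle in equation 11.50 can be evaluated numerically for a bound orbit in the inverse-square potential U = k/r, where it should come out as exactly 2π, so that the orbit closes after one radial oscillation. The sketch below is a hedged illustration: the substitution r = c + d sin t, the midpoint rule, the function name `apsidal_angle_kepler`, the variable `ell` for the angular momentum l, and the sample values are all assumptions chosen for this check.

```python
import math


def apsidal_angle_kepler(e_cm, ell, mu, k, n=2000):
    """Evaluate eq. 11.50 for U = k/r (k < 0, e_cm < 0) with the midpoint rule.

    The substitution r = c + d*sin(t), with t in (-pi/2, pi/2), removes the
    inverse-square-root singularities at the two turning points.
    """
    # Turning points: roots of e_cm*r^2 - k*r - ell^2/(2 mu) = 0
    disc = math.sqrt(k**2 + 2.0 * e_cm * ell**2 / mu)
    r_min = (k + disc) / (2.0 * e_cm)
    r_max = (k - disc) / (2.0 * e_cm)
    c = 0.5 * (r_max + r_min)
    d = 0.5 * (r_max - r_min)
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = -0.5 * math.pi + (i + 0.5) * h       # midpoints avoid the endpoints
        r = c + d * math.sin(t)
        radicand = 2.0 * mu * (e_cm - k / r - ell**2 / (2.0 * mu * r**2))
        total += ell * d * math.cos(t) / (r**2 * math.sqrt(radicand)) * h
    return 2.0 * total


delta_psi = apsidal_angle_kepler(e_cm=-0.68, ell=0.8, mu=1.0, k=-1.0)
```

For a general potential the turning points would have to be located numerically, and the resulting angle would usually not be a rational fraction of 2π.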


11.2 Example: Orbit equation of motion for a free body 


It is illustrative to use the differential orbit equation 11.39 to show that a body in free motion travels in a straight line. Assume that a line through the origin O intersects perpendicular to the instantaneous trajectory at the point Q which has polar coordinates $(r_0, \delta)$ relative to the origin. The point P, with polar coordinates $(r, \phi)$, lies on a straight line through Q that is perpendicular to OQ if, and only if, $r\cos(\phi - \delta) = r_0$. Since the force is zero then the differential orbit equation simplifies to

$$\frac{d^2u(\phi)}{d\phi^2} + u(\phi) = 0$$

A solution of this is

$$u(\phi) = \frac{1}{r_0}\cos(\phi - \delta)$$

where $r_0$ and $\delta$ are arbitrary constants. This can be rewritten as

$$r(\phi) = \frac{r_0}{\cos(\phi - \delta)}$$

This is the equation of a straight line in polar coordinates as illustrated in the adjacent figure. This shows that a free body moves in a straight line if no forces are acting on the body.

Trajectory of a free body
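That the polar curve of this example is a straight line can be confirmed by converting sample points to Cartesian coordinates. In the sketch below the values of r0 and δ, the angular range, and the function name are arbitrary illustrative choices.

```python
import math


def free_body_points(r0=2.0, delta=0.3, n=9):
    """Cartesian points on r(phi) = r0 / cos(phi - delta), |phi - delta| < pi/2."""
    points = []
    for i in range(n):
        phi = delta - 1.2 + 2.4 * i / (n - 1)
        r = r0 / math.cos(phi - delta)
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points


# A straight line at perpendicular distance r0 from the origin, with its normal
# at angle delta, satisfies x*cos(delta) + y*sin(delta) = r0.
deviations = [abs(x * math.cos(0.3) + y * math.sin(0.3) - 2.0)
              for x, y in free_body_points()]
```

Every deviation is zero to rounding error, which is the Cartesian statement that the points lie on one line.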
11.8 Inverse-square, two-body, central force


The most important conservative, two-body, central interaction is the attractive inverse-square law force, which is encountered in both gravitational attraction and the Coulomb force. This force $\mathbf{F}(\mathbf{r})$ can be written in the form

$$\mathbf{F}(\mathbf{r}) = \frac{k}{r^2}\hat{\mathbf{r}} \tag{11.52}$$

The force constant $k$ is defined to be negative for an attractive force and positive for a repulsive force. In S.I. units the force constant $k = -Gm_1m_2$ for the gravitational force and $k = \frac{q_1q_2}{4\pi\varepsilon_0}$ for the Coulomb force.


Note that this sign convention is the opposite of what is used in many books which use a negative sign in 
equation 11.52 and assume k to be positive for an attractive force and negative for a repulsive force. 

The conservative, inverse-square, two-body, central force is unique in that the underlying symmetries 
lead to four conservation laws, all of which are of pivotal importance in nature. 


1. Conservation of angular momentum: Like all conservative central forces, the inverse-square cen- 
tral two-body force conserves angular momentum as proven in chapter 11.3. 


2. Conservation of energy: This conservative central force can be represented in terms of a scalar potential energy $U(r)$ as given by equation 11.2, where for this central force

$$U(r) = \frac{k}{r} \tag{11.53}$$

Moreover, equation 11.42 showed that the center-of-mass Hamiltonian is conserved, that is, $H_{cm} = E_{cm}$.


3. Gauss’ Law: For a conservative, inverse-square, two-body, central force, the flux of the force field out 
of any closed surface is proportional to the algebraic sum of the sources and sinks of this field that 
are located inside the closed surface. The net flux is independent of the distribution of the sources 
and sinks inside the closed surface, as well as the size and shape of the closed surface. Chapter 2.14.5 
proved this for the gravitational force field. 


4. Closed orbits: Two bodies interacting via the conservative, inverse-square, two-body, central force follow closed (degenerate) orbits as stated by Bertrand's Theorem. The first consequence of this symmetry is that Kepler's laws of planetary motion have stable, single-valued orbits. The second consequence of this symmetry is the conservation of the eccentricity vector discussed in chapter 11.8.4.


Observables that depend on Gauss's Law, or on closed planetary orbits, are extremely sensitive to addition of even a minuscule incremental exponent $\xi$ to the radial dependence $r^{-(2\pm\xi)}$ of the force. The statement that the inverse-square, two-body, central force leads to closed orbits can be proven by inserting equation 11.52 into the orbit differential equation,

$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) = -\frac{\mu k}{l^2} \tag{11.54}$$

Using the transformation

$$y \equiv u + \frac{\mu k}{l^2} \tag{11.55}$$

the orbit equation becomes

$$\frac{d^2y}{d\psi^2} + y = 0 \tag{11.56}$$

A solution of this equation is

$$y = B\cos(\psi - \psi_0) \tag{11.57}$$

Therefore

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + e\cos(\psi - \psi_0)\right] \tag{11.58}$$

This is the equation of a conic section. For an attractive, inverse-square, central force, equation 11.58 is the equation for an ellipse with the origin of $r$ at one of the foci of the ellipse that has eccentricity $e$, defined as

$$e \equiv -B\frac{l^2}{\mu k} \tag{11.59}$$
Equation 11.58 is the polar equation of a conic section. Equation 11.58 also can be derived with the origin at a focus by inserting the inverse square law potential into equation 11.49 which gives

$$\psi = \int\frac{\pm du}{\sqrt{\frac{2\mu E_{cm}}{l^2} - \frac{2\mu k}{l^2}u - u^2}} + \text{constant} \tag{11.60}$$

The solution of this gives

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos(\psi - \psi_0)\right] \tag{11.61}$$

Equations 11.58 and 11.61 are identical if the eccentricity $e$ equals

$$e = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \tag{11.62}$$


The value of $\psi_0$ merely determines the orientation of the major axis of the equivalent orbit. Without loss of generality, it is possible to assume that the angle $\psi$ is measured with respect to the major axis of the orbit, that is $\psi_0 = 0$. Then the equation can be written as

$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + e\cos\psi\right] = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos\psi\right] \tag{11.63}$$

This is the equation of a conic section where $e$ is the eccentricity of the conic section. The conic section is a hyperbola if $e > 1$, parabola if $e = 1$, ellipse if $e < 1$, and a circle if $e = 0$. All the equivalent one-body orbits for an attractive force have the origin of the force at a focus of the conic section. The orbits depend on whether the force is attractive or repulsive, on the conserved angular momentum $l$, and on the center-of-mass energy $E_{cm}$.
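The classification of the conic section by eccentricity can be written as a small helper that starts from the conserved quantities. This is a sketch under stated assumptions: the function names, the variable `ell` for the angular momentum l, and the tolerance used to recognise the limiting cases e = 0 and e = 1 are illustrative choices, and the radicand is clamped at zero only to guard against rounding error.

```python
import math


def eccentricity(e_cm, ell, mu, k):
    """Eccentricity from eq. 11.62: sqrt(1 + 2 E_cm ell^2 / (mu k^2))."""
    radicand = 1.0 + 2.0 * e_cm * ell**2 / (mu * k**2)
    return math.sqrt(max(0.0, radicand))   # clamp tiny negative rounding errors


def classify_orbit(e_cm, ell, mu, k, tol=1e-12):
    """Name the conic section of eq. 11.63 for the given conserved quantities."""
    e = eccentricity(e_cm, ell, mu, k)
    if e < tol:
        return "circle"
    if e < 1.0 - tol:
        return "ellipse"
    if e <= 1.0 + tol:
        return "parabola"
    return "hyperbola"


labels = [classify_orbit(e_cm, 1.0, 1.0, -1.0) for e_cm in (-0.5, -0.3, 0.0, 0.5)]
```

With l = μ = 1 and k = -1 the four sample energies step through all four cases as the energy rises from the minimum of the effective potential.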


11.8.1 Bound orbits 


Closed bound orbits occur only if the following requirements are satisfied.

1. The force must be attractive ($k < 0$); then equation 11.63 ensures that $r$ is positive.

2. For a closed elliptical orbit, the eccentricity $e < 1$ of the equivalent one-body representation of the orbit implies that the total center-of-mass energy $E_{cm} < 0$, that is, the closed orbit is bound.

Bound elliptical orbits have the center-of-force at one interior focus F of the elliptical one-body representation of the orbit as shown in figure 11.5.

The minimum value of the orbit $r = r_{min}$ occurs when $\psi = 0$, where

$$r_{min} = -\frac{l^2}{\mu k\left[1 + e\right]} \tag{11.64}$$

This minimum distance is called the periapsis¹.

Figure 11.5: Bound elliptical orbit.


¹The Greek term apsis refers to the points of greatest or least distance of approach for an orbiting body from one of the foci of the elliptical orbit. The terms periapsis and pericenter both are used to designate the closest distance of approach, while apoapsis and apocenter are used to designate the farthest distance of approach. Attaching the terms "peri-" and "apo-" to the general term "-apsis" is preferred over having different names for each object in the solar system. For example, frequently used terms are "-helion" for orbits of the sun, "-gee" for orbits around the earth, and "-cynthion" for orbits around the moon.
The maximum distance, $r = r_{max}$, which is called the apoapsis, occurs when $\psi = 180^\circ$

$$r_{max} = -\frac{l^2}{\mu k\left[1 - e\right]} \tag{11.65}$$

Remember that since $k < 0$ for bound orbits, the negative signs in equations 11.64 and 11.65 lead to $r > 0$.

The most bound orbit is a circle having $e = 0$ which implies that $E_{cm} = -\frac{\mu k^2}{2l^2}$.

The shape of the elliptical orbit also can be described with respect to the center of the elliptical equivalent orbit by deriving the lengths of the semi-major axis $a$ and the semi-minor axis $b$ shown in figure 11.5.

$$a = \frac{1}{2}\left(r_{min} + r_{max}\right) = -\frac{1}{2}\left(\frac{l^2}{\mu k\left[1 + e\right]} + \frac{l^2}{\mu k\left[1 - e\right]}\right) = -\frac{l^2}{\mu k\left[1 - e^2\right]} \tag{11.66}$$

$$b = a\sqrt{1 - e^2} = -\frac{l^2}{\mu k\sqrt{1 - e^2}} \tag{11.67}$$


Remember that the predicted bound elliptical orbit corresponds to the equivalent one-body representation for the two-body motion as illustrated in figure 11.2. This can be transformed to the individual spatial trajectories of each of the two bodies in an inertial frame.
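The geometric parameters of a bound orbit follow directly from the energy and angular momentum through equations 11.62 and 11.64 to 11.67. The function below is a minimal sketch with illustrative names and values; it assumes an attractive force (k < 0) and a bound, non-parabolic orbit (E_cm < 0), and uses `ell` for the angular momentum l.

```python
import math


def bound_orbit_elements(e_cm, ell, mu, k):
    """Return (ecc, r_min, r_max, a, b) for a bound inverse-square orbit."""
    ecc = math.sqrt(max(0.0, 1.0 + 2.0 * e_cm * ell**2 / (mu * k**2)))  # eq. 11.62
    r_min = -ell**2 / (mu * k * (1.0 + ecc))                            # eq. 11.64
    r_max = -ell**2 / (mu * k * (1.0 - ecc))                            # eq. 11.65
    a = 0.5 * (r_min + r_max)                                           # eq. 11.66
    b = a * math.sqrt(1.0 - ecc**2)                                     # eq. 11.67
    return ecc, r_min, r_max, a, b


ecc, r_min, r_max, a, b = bound_orbit_elements(e_cm=-0.68, ell=0.8, mu=1.0, k=-1.0)
```

For these sample values the orbit has eccentricity 0.36 and an apoapsis at r = 1, and the semi-major axis depends only on the energy, since a = k/(2 E_cm).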


11.8.2 Kepler’s laws for bound planetary motion 


Kepler's three laws of motion apply to the motion of two bodies in a bound orbit due to the attractive gravitational force for which $k = -Gm_1m_2$.

1) Each planet moves in an elliptical orbit with the sun at one focus.
2) The radius vector, drawn from the sun to a planet, describes equal areas in equal times.
3) The square of the period of revolution about the sun is proportional to the cube of the major axis of the orbit.

The motion of two bodies interacting via the gravitational force, which is a conservative, inverse-square, two-body central force, is best handled using the equivalent orbit representation. The first and second laws were proved in chapters 11.8 and 11.3. That is, the second law is equivalent to the statement that the angular momentum is conserved. The third law can be derived using the fact that the area of an ellipse is


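$$A = \pi ab = \pi a^2\sqrt{1 - e^2} = \pi\frac{l}{\sqrt{-\mu k}}a^{\frac{3}{2}} \tag{11.68}$$

Equations 11.26 and 11.27 give that the rate of change of area swept out by the radius vector is

$$\frac{dA}{dt} = \frac{1}{2}r^2\dot{\psi} = \frac{l}{2\mu} \tag{11.69}$$

Therefore the period $\tau$ for one revolution is the time needed to sweep out the area of one complete ellipse

$$\tau = \frac{A}{\left(\frac{dA}{dt}\right)} = 2\pi\left(\frac{\mu}{-k}\right)^{\frac{1}{2}}a^{\frac{3}{2}} \tag{11.70}$$

This leads to Kepler's 3rd law

$$\tau^2 = -4\pi^2\frac{\mu}{k}a^3 \tag{11.71}$$

Bound orbits occur only for attractive forces for which the force constant $k$ is negative, and thus cancels the negative sign in equation 11.71. For example, for the gravitational force $k = -Gm_1m_2$.

Note that the reduced mass $\mu = \frac{m_1m_2}{m_1 + m_2}$ occurs in Kepler's 3rd law. That is, Kepler's third law can be written in terms of the actual masses of the bodies to be

$$\tau^2 = \frac{4\pi^2}{G\left(m_1 + m_2\right)}a^3 \tag{11.72}$$

In relating the relative periods of the different planets Kepler made the approximation that the mass of the planet $m_1$ is negligible relative to the mass of the sun $m_2$.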
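Equation 11.72 can be tried on the Earth-Sun system. The sketch below uses rounded reference values for the gravitational constant, the two masses, and the semi-major axis; these constants and the function name are supplied here for illustration and are not taken from the text.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2 (rounded)
M_SUN = 1.989e30     # mass of the sun in kg (rounded)
M_EARTH = 5.972e24   # mass of the earth in kg (rounded)
AU = 1.496e11        # semi-major axis of the earth's orbit in m (rounded)


def kepler_period(a, m1, m2):
    """Orbital period in seconds from tau^2 = 4 pi^2 a^3 / (G (m1 + m2))."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))


period_days = kepler_period(AU, M_EARTH, M_SUN) / 86400.0
period_days_massless_planet = kepler_period(AU, 0.0, M_SUN) / 86400.0
```

With these rounded inputs the period comes out close to one year, and neglecting the planet's mass, as Kepler effectively did, changes the result by well under a thousandth of a day.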
The eccentricity of the major planets ranges from $e = 0.2056$ for Mercury, to $e = 0.0068$ for Venus. The Earth has an eccentricity of $e = 0.0167$ with $r_{min} = 91\cdot10^6$ miles and $r_{max} = 95\cdot10^6$ miles. On the other hand, $e = 0.967$ for Halley's comet, that is, the radius vector ranges from 0.6 to 35 times the radius of the orbit of the Earth, for a semi-major axis of about 18 times that radius.

The orbit energy can be derived by substituting the eccentricity, given by equation 11.62, into the semi-major axis length $a$, given by equation 11.66, which leads to the center-of-mass energy of

$$E_{cm} = \frac{k}{2a} \tag{11.73}$$

However, the Hamiltonian, given by equation 11.42, implies that $E_{cm}$ is

$$E_{cm} = \frac{1}{2}\mu v^2 + \left(\frac{k}{r}\right) = \frac{k}{2a} \tag{11.74}$$

For the simple case of a circular orbit, $a = r$, then the velocity $v$ equals

$$v = \sqrt{-\frac{k}{\mu r}} \tag{11.75}$$

For a circular orbit, the drag on a satellite lowers the total energy resulting in a decrease in the radius of the orbit and a concomitant increase in velocity. That is, when the orbit radius is decreased, part of the decrease in potential energy accounts for the work done against the drag, and the remaining part goes towards increase of the kinetic energy. Also note that, as predicted by the Virial Theorem, the kinetic energy of a circular orbit equals minus one half of the potential energy for the inverse square law force, that is $T = -\frac{1}{2}U$, and the same relation holds between the time averages for any bound orbit.
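The behaviour of a satellite under drag and the virial relation for circular orbits can both be read off equations 11.53, 11.73, and 11.75. The sketch below compares two circular orbits of different radius; the function name and the numerical values are illustrative assumptions.

```python
import math


def circular_orbit_energies(r, mu, k):
    """Return (v, T, U, E) for a circular orbit of radius r with F = k/r^2, k < 0."""
    v = math.sqrt(-k / (mu * r))     # eq. 11.75
    kinetic = 0.5 * mu * v**2
    potential = k / r                # eq. 11.53
    return v, kinetic, potential, kinetic + potential


v_outer, t_outer, u_outer, e_outer = circular_orbit_energies(r=2.0, mu=1.0, k=-1.0)
v_inner, t_inner, u_inner, e_inner = circular_orbit_energies(r=1.5, mu=1.0, k=-1.0)
```

Moving from the outer to the inner orbit lowers the total energy while raising the speed, and the potential energy drops by twice the amount that the kinetic energy rises, the difference being the energy removed by the drag.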


11.8.3 Unbound orbits 


Attractive inverse-square central forces lead to hyperbolic orbits for $e > 1$ for which $E_{cm} > 0$, that is, the orbit is unbound. In addition, the orbits always are unbound for a repulsive force since $U = \frac{k}{r}$ is positive as is the kinetic energy $T_{cm}$, thus $E_{cm} = T_{cm} + U > 0$. The radial orbit equation for either an attractive or a repulsive force is

$$r = -\frac{l^2}{\mu k\left[1 + e\cos\psi\right]} \tag{11.76}$$

For a repulsive force $k$ is positive and $l^2$ always is positive. Therefore to ensure that $r$ remains positive the bracket term must be negative. That is

$$\left[1 + e\cos\psi\right] < 0 \qquad k > 0 \tag{11.77}$$

For an attractive force $k$ is negative and since $l^2$ is positive then the bracket term must be positive to ensure that $r$ is positive. That is,

$$\left[1 + e\cos\psi\right] > 0 \qquad k < 0 \tag{11.78}$$

Figure 11.6 shows both branches of the hyperbola for a given angle $\psi$ for the equivalent two-body orbits where the center of force is at the origin. For an attractive force, $k < 0$, the center of force is at the interior focus of the hyperbola, whereas for a repulsive force the center of force is at the exterior focus. For a given value of $|\psi|$ the asymptotes of the orbits both are displaced by the same impact parameter $b$ from parallel lines passing through the center of force. The scattering angle, between the outgoing direction of the scattered body and the incident direction, is designated to be $\theta$, which is related to the angle $\psi$ by $\theta = 180^\circ - 2\psi$.
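The sign conditions 11.77 and 11.78 determine which polar angles are physically reachable on each branch of the hyperbola. A minimal sketch of that bookkeeping follows; the function name, the use of `None` for forbidden angles, the variable `ell` for the angular momentum l, and the sample values are illustrative assumptions.

```python
import math


def orbit_radius(psi, ecc, ell, mu, k):
    """Radius from eq. 11.76, or None where the bracket sign would give r <= 0."""
    bracket = 1.0 + ecc * math.cos(psi)
    if bracket * k >= 0.0:
        # k > 0 needs a negative bracket (11.77); k < 0 needs a positive one (11.78)
        return None
    return -ell**2 / (mu * k * bracket)


attractive_at_0 = orbit_radius(0.0, 2.0, 1.0, 1.0, -1.0)
repulsive_at_0 = orbit_radius(0.0, 2.0, 1.0, 1.0, 1.0)
repulsive_at_pi = orbit_radius(math.pi, 2.0, 1.0, 1.0, 1.0)
attractive_at_pi = orbit_radius(math.pi, 2.0, 1.0, 1.0, -1.0)
```

For e = 2 the attractive orbit has its closest approach at ψ = 0 while the repulsive orbit has its closest approach at ψ = 180°, and the repulsive distance of closest approach is the larger of the two.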


Figure 11.6: Hyperbolic two-body orbits for repulsive (left) and attractive (right) inverse-square, central two-body forces. Both orbits have the angular momentum vector pointing upwards out of the plane of the orbit.
11.8.4 Eccentricity vector


Two bodies interacting via a conservative two-body central force have two invariant first-order integrals, namely the conservation of energy and the conservation of angular momentum. For the special case of the inverse-square law, there is a third invariant of the motion, which Hamilton called the eccentricity vector², that unambiguously defines the orientation and direction of the major axis of the elliptical orbit. It will be shown that the angular momentum plus the eccentricity vector completely define the plane and orientation of the orbit for a conservative inverse-square law central force.

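Newton's second law for a central force can be written in the form

$$\dot{\mathbf{p}} = f(r)\hat{\mathbf{r}} \tag{11.79}$$

Note that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is conserved for a central force, that is $\dot{\mathbf{L}} = 0$. Therefore the time derivative of the product $\mathbf{p}\times\mathbf{L}$ reduces to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = \dot{\mathbf{p}}\times\mathbf{L} = f(r)\hat{\mathbf{r}}\times\left(\mathbf{r}\times\mu\dot{\mathbf{r}}\right) = f(r)\frac{\mu}{r}\left[\mathbf{r}\left(\mathbf{r}\cdot\dot{\mathbf{r}}\right) - r^2\dot{\mathbf{r}}\right] \tag{11.80}$$

This can be simplified using the fact that

$$\mathbf{r}\cdot\dot{\mathbf{r}} = \frac{1}{2}\frac{d}{dt}\left(\mathbf{r}\cdot\mathbf{r}\right) = r\dot{r} \tag{11.81}$$

thus

$$f(r)\frac{\mu}{r}\left[\mathbf{r}\left(\mathbf{r}\cdot\dot{\mathbf{r}}\right) - r^2\dot{\mathbf{r}}\right] = -\mu f(r)r^2\left[\frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\dot{r}}{r^2}\right] = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.82}$$

This allows equation 11.80 to be reduced to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.83}$$

Assume the special case of the inverse-square law, equation 11.52, then the central force equation 11.83 reduces to

$$\frac{d}{dt}\left(\mathbf{p}\times\mathbf{L}\right) = -\frac{d}{dt}\left(\mu k\hat{\mathbf{r}}\right) \tag{11.84}$$

or

$$\frac{d}{dt}\left[\left(\mathbf{p}\times\mathbf{L}\right) + \left(\mu k\hat{\mathbf{r}}\right)\right] = 0 \tag{11.85}$$

Define the eccentricity vector $\mathbf{A}$ as

$$\mathbf{A} \equiv \left(\mathbf{p}\times\mathbf{L}\right) + \left(\mu k\hat{\mathbf{r}}\right) \tag{11.86}$$

then equation 11.85 corresponds to

$$\frac{d\mathbf{A}}{dt} = 0 \tag{11.87}$$

This is a statement that the eccentricity vector $\mathbf{A}$ is a constant of motion for an inverse-square, central force.

The definition of the eccentricity vector $\mathbf{A}$ and angular momentum vector $\mathbf{L}$ implies a zero scalar product,

$$\mathbf{A}\cdot\mathbf{L} = 0 \tag{11.88}$$

Thus the eccentricity vector $\mathbf{A}$ and angular momentum $\mathbf{L}$ are mutually perpendicular, that is, $\mathbf{A}$ is in the plane of the orbit while $\mathbf{L}$ is perpendicular to the plane of the orbit. The eccentricity vector $\mathbf{A}$ always points along the major axis of the ellipse from the focus to the periapsis as illustrated on the left side in figure 11.7.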
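The constancy and direction of the eccentricity vector can be checked point by point on the conic-section orbit 1/r = -(μk/l²)(1 + e cos ψ). The sketch below assumes the orbit lies in the x-y plane with the angular momentum along z and the periapsis on the positive x-axis; the radial velocity is taken from dr/dt = -(l/μ) du/dψ. The function name, the variable `ell` for the angular momentum l, and the parameter values are illustrative.

```python
import math


def eccentricity_vector(psi, ecc, ell, mu, k):
    """Components (A_x, A_y) of A = p x L + mu*k*r_hat at polar angle psi."""
    r = -ell**2 / (mu * k * (1.0 + ecc * math.cos(psi)))
    r_dot = -(k / ell) * ecc * math.sin(psi)
    psi_dot = ell / (mu * r**2)
    c, s = math.cos(psi), math.sin(psi)
    # Cartesian momentum p = mu * (r_dot * r_hat + r * psi_dot * psi_hat)
    px = mu * (r_dot * c - r * psi_dot * s)
    py = mu * (r_dot * s + r * psi_dot * c)
    # (px, py, 0) x (0, 0, ell) = (py*ell, -px*ell, 0)
    return py * ell + mu * k * c, -px * ell + mu * k * s


ecc, ell, mu, k = 0.75, 1.0, 1.0, -1.0
vectors = [eccentricity_vector(2.0 * math.pi * i / 12, ecc, ell, mu, k)
           for i in range(12)]
```

At every sampled point the vector has the same components, it lies along the positive x-axis toward the periapsis, and its magnitude is -μke, here 0.75.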


²The symmetry underlying the eccentricity vector is less intuitive than the energy or angular momentum invariants, leading to it being discovered independently several times during the past three centuries. Jakob Hermann was the first to identify this invariant for the special case of the inverse-square central force. Bernoulli generalized his proof in 1710. Laplace derived the invariant at the end of the 18th century using analytical mechanics. Hamilton derived the connection between the invariant and the orbit eccentricity. Gibbs derived the invariant using vector analysis. Runge published Gibbs' derivation in his textbook which was referenced by Lenz in a 1924 paper on the quantal model of the hydrogen atom. Goldstein named this invariant the "Laplace-Runge-Lenz vector", while others have named it the "Runge-Lenz vector" or the "Lenz vector". This book uses Hamilton's more intuitive name of "eccentricity vector".


Figure 11.7: The elliptical trajectory and eccentricity vector $\mathbf{A}$ for two bodies interacting via the inverse-square, central force for eccentricity $e = 0.75$. The left plot shows the elliptical spatial trajectory where the semi-major axis is assumed to be on the x-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force centre is at one focus of the ellipse. The vector coupling relation $\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})$ is illustrated at four points on the spatial trajectory. The right plot is a hodograph of the linear momentum $\mathbf{p}$ for this trajectory. The periapsis is denoted by the number 1 and the apoapsis is marked as 3 on both plots. Note that the eccentricity vector $\mathbf{A}$ is a constant that points parallel to the major axis towards the periapsis.


As a consequence, the two orthogonal vectors $\mathbf{A}$ and $\mathbf{L}$ completely define the plane of the orbit, plus the orientation of the major axis of the Kepler orbit, in this plane. The three vectors $\mathbf{A}$, $\mathbf{p}\times\mathbf{L}$, and $(\mu k\hat{\mathbf{r}})$ obey the triangle rule as illustrated in the left side of figure 11.7.

Hamilton noted the direct connection between the eccentricity vector $\mathbf{A}$ and the eccentricity $e$ of the conic section orbit. This can be shown by considering the scalar product

$$\mathbf{A}\cdot\mathbf{r} = Ar\cos\psi = \mathbf{r}\cdot\left(\mathbf{p}\times\mathbf{L}\right) + \mu kr \tag{11.89}$$

Note that the triple scalar product can be permuted to give

$$\mathbf{r}\cdot\left(\mathbf{p}\times\mathbf{L}\right) = \left(\mathbf{r}\times\mathbf{p}\right)\cdot\mathbf{L} = \mathbf{L}\cdot\mathbf{L} = l^2 \tag{11.90}$$

Inserting equation 11.90 into 11.89 gives

$$\frac{1}{r} = -\frac{\mu k}{l^2}\left(1 - \frac{A}{\mu k}\cos\psi\right) \tag{11.91}$$

Note that equations 11.63 and 11.91 are identical if $\psi_0 = 0$. This implies that the eccentricity $e$ and $A$ are related by

$$e = -\frac{A}{\mu k} \tag{11.92}$$

where $k$ is defined to be negative for an attractive force. The relation between the eccentricity and total center-of-mass energy can be used to rewrite equation 11.62 in the form

$$A^2 = \mu^2k^2 + 2\mu E_{cm}l^2 \tag{11.93}$$


The combination of the eccentricity vector $\mathbf{A}$ and the angular momentum vector $\mathbf{L}$ completely specifies the orbit for an inverse square-law central force. The trajectory is in the plane perpendicular to the angular momentum vector $\mathbf{L}$, while the eccentricity, plus the orientation of the orbit, both are defined by the eccentricity vector $\mathbf{A}$. The eccentricity vector and angular momentum vector each have three independent coordinates, that is, these two vector invariants provide six constraints, while the scalar invariant energy $E_{cm}$ adds one additional constraint. The exact location of the particle moving along the trajectory is not defined and thus there are only five independent coordinates governed by the above seven constraints. Thus the eccentricity vector, angular momentum, and center-of-mass energy are related by the two equations 11.88 and 11.93.

Noether's theorem states that each conservation law is a manifestation of an underlying symmetry. Identification of the underlying symmetry responsible for the conservation of the eccentricity vector $\mathbf{A}$ is elucidated using equation 11.86 to give

$$\left(\mu k\hat{\mathbf{r}}\right) = \mathbf{A} - \left(\mathbf{p}\times\mathbf{L}\right) \tag{11.94}$$

Take the scalar product

$$\left(\mu k\hat{\mathbf{r}}\right)\cdot\left(\mu k\hat{\mathbf{r}}\right) = \left(\mu k\right)^2 = p^2L^2 + A^2 - 2\mathbf{A}\cdot\left(\mathbf{p}\times\mathbf{L}\right) \tag{11.95}$$

Choose the angular momentum to be along the z-axis, that is, $\mathbf{L} = l\hat{\mathbf{z}}$, and, since $\mathbf{p}$ and $\mathbf{A}$ are perpendicular to $\mathbf{L}$, then $\mathbf{p}$ and $\mathbf{A}$ are in the x-y plane. Assume that the semimajor axis of the elliptical orbit is along the x-axis, then the locus of the momentum vector on a momentum hodograph has the equation

$$p_x^2 + \left(p_y - \frac{A}{l}\right)^2 = \left(\frac{\mu k}{l}\right)^2 \tag{11.96}$$

Equation 11.96 implies that the locus of the momentum vector is a circle of radius $\left|\frac{\mu k}{l}\right|$ with the center displaced from the origin at coordinates $\left(0, \frac{A}{l}\right)$ as shown by the momentum hodograph on the right side of figure 11.7. The angle $\beta$ and eccentricity $e$ are related by

$$\cos\beta = \frac{A/l}{\mu k/l} = \frac{A}{\mu k} = -e \tag{11.97}$$

The circular orbit is centered at the origin for $e = -\frac{A}{\mu k} = 0$, and thus the magnitude $|\mathbf{p}|$ is a constant around the whole trajectory.
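The circular momentum hodograph of equation 11.96 can be confirmed by computing the Cartesian momentum at points around the conic-section orbit. As in the text, the sketch assumes the angular momentum lies along z and the major axis along x; the function name, the variable `ell` for the angular momentum l, and the parameter values are illustrative assumptions.

```python
import math


def momentum_on_orbit(psi, ecc, ell, mu, k):
    """Cartesian momentum (p_x, p_y) at polar angle psi on the conic of eq. 11.63."""
    r = -ell**2 / (mu * k * (1.0 + ecc * math.cos(psi)))
    r_dot = -(k / ell) * ecc * math.sin(psi)     # from dr/dt = -(ell/mu) du/dpsi
    # mu * r * psi_dot = ell / r is the transverse momentum
    px = mu * r_dot * math.cos(psi) - (ell / r) * math.sin(psi)
    py = mu * r_dot * math.sin(psi) + (ell / r) * math.cos(psi)
    return px, py


ecc, ell, mu, k = 0.75, 1.0, 1.0, -1.0
A = -mu * k * ecc                  # magnitude of the eccentricity vector, eq. 11.92
radius = abs(mu * k / ell)
offsets = []
for i in range(24):
    px, py = momentum_on_orbit(2.0 * math.pi * i / 24, ecc, ell, mu, k)
    offsets.append(math.hypot(px, py - A / ell) - radius)
```

Each entry of `offsets` is the distance of a momentum point from the circle of equation 11.96, and all of them vanish to rounding error even though the magnitude of the momentum itself changes by a factor of seven between apoapsis and periapsis for e = 0.75.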

The inverse-square, central, two-body, force is unusual in that it leads to stable closed bound orbits because the radial and angular frequencies are degenerate, i.e. $\omega_r = \omega_\psi$. In momentum space, the locus of
the linear momentum vector p is a perfect circle which is the underlying symmetry responsible for both the 
fact that the orbits are closed, and the invariance of the eccentricity vector. Mathematically this symmetry 
for the Kepler problem corresponds to the body moving freely on the boundary of a four-dimensional sphere 
in space and momentum. The invariance of the eccentricity vector is a manifestation of the special property 
of the inverse-square, central force under certain rotations in this four-dimensional space; this O(4) symmetry 
is an example of a hidden symmetry. 


11.9 Isotropic, linear, two-body, central force 


Closed orbits occur for the two-dimensional linear oscillator when $\frac{\omega_x}{\omega_y}$ is a rational fraction as discussed in chapter 3.3. Bertrand's Theorem states that the linear oscillator, and the inverse-square law (Kepler problem), are the only two-body central forces that have single-valued, stable, closed orbits of the coupled radial and angular motion. The invariance of the eccentricity vector was the underlying symmetry leading to single-valued, stable, closed orbits for the Kepler problem. It is interesting to explore the symmetry that leads to stable closed orbits for the harmonic oscillator. For simplicity, this discussion will be restricted to the isotropic, harmonic, two-body, central force where $\omega_x = \omega_y = \omega$, for which the two-body, central force is linear

$$\mathbf{F}(\mathbf{r}) = k\mathbf{r} \tag{11.98}$$

where $k > 0$ corresponds to a repulsive force and $k < 0$ to an attractive force. This isotropic harmonic force can be expressed in terms of a spherical potential $U(r)$ where

$$U(r) = -\frac{1}{2}kr^2 \tag{11.99}$$


Since this is a central two-body force, both the equivalent one-body representation, and the conservation 
of angular momentum, are equally applicable to the harmonic two-body force. As discussed in section 
11.3, since the two-body force is central, the motion is confined to a plane, and thus the Lagrangian can 
be expressed in polar coordinates. In addition, since the force is spherically symmetric, then the angular
momentum is conserved. The orbit solutions are conic sections as described in chapter 11.7. The shape of 
the orbit for the harmonic two-body central force can be derived using either polar or cartesian coordinates 
as illustrated below. 


11.9.1 Polar coordinates 


The origin of the equivalent orbit for the harmonic force will be found to be at the center of an ellipse, rather 
than the foci of the ellipse as found for the inverse square law. The shape of the orbit can be defined using 
a Binet differential orbit equation that employs the transformation 


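$$u' \equiv \frac{1}{r^2} \qquad (11.100)$$
Then
$$\frac{du'}{d\psi} = -\frac{2}{r^3}\frac{dr}{d\psi} \qquad (11.101)$$
The chain rule gives that
$$\dot r = \frac{dr}{d\psi}\dot\psi = -\frac{r^3}{2}\dot\psi\frac{du'}{d\psi} = -\frac{r\,p_\psi}{2\mu}\frac{du'}{d\psi} \qquad (11.102)$$
Substituting this into the Hamiltonian $H_{cm}$, equation 11.42, gives
$$\frac{1}{8}\frac{p_\psi^2}{\mu}\frac{1}{u'}\left(\frac{du'}{d\psi}\right)^2 + \frac{p_\psi^2}{2\mu}u' - \frac{k}{2u'} = E \qquad (11.103)$$
Rearranging this equation gives
$$\left(\frac{du'}{d\psi}\right)^2 + 4u'^2 - \frac{8\mu E}{p_\psi^2}u' = \frac{4k\mu}{p_\psi^2} \qquad (11.104)$$
Addition of a constant to both sides of the equation completes the square
$$\left[\frac{d}{d\psi}\left(u' - \frac{\mu E}{p_\psi^2}\right)\right]^2 + 4\left(u' - \frac{\mu E}{p_\psi^2}\right)^2 = \frac{4k\mu}{p_\psi^2} + 4\left(\frac{\mu E}{p_\psi^2}\right)^2 \qquad (11.105)$$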


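The right-hand side of equation 11.105 is a constant. The solution of 11.105 must be a sine or cosine function
with polar angle $\psi = \omega t$. That is
$$\left(u' - \frac{\mu E}{p_\psi^2}\right) = \left[\left(\frac{\mu E}{p_\psi^2}\right)^2 + \frac{k\mu}{p_\psi^2}\right]^{\frac{1}{2}}\cos 2(\psi - \psi_0) \qquad (11.106)$$
That is,
$$\frac{1}{r^2} = \frac{\mu E}{p_\psi^2}\left[1 + \left(1 + \frac{k p_\psi^2}{\mu E^2}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right] \qquad (11.107)$$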


Equation 11.107 corresponds to a closed orbit centered at the origin of the elliptical orbit as illustrated in 
figure 11.8. The eccentricity e of this closed orbit is given by 


$$\left(1 + \frac{k p_\psi^2}{\mu E^2}\right)^{\frac{1}{2}} = \frac{e^2}{2 - e^2} \qquad (11.108)$$
Equations 11.66, 11.67 give that the eccentricity is related to the semi-major a and semi-minor b axes by
$$e^2 = 1 - \left(\frac{b}{a}\right)^2 \qquad (11.109)$$


Note that for a repulsive force k > 0, then e > 1 leading to unbound hyperbolic or parabolic orbits centered 
on the origin. An attractive force, k < 0, allows for bound elliptical, as well as unbound parabolic and 
hyperbolic orbits. 
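
The polar-coordinate result can be checked numerically. The sketch below is illustrative only: it assumes the orbit has the form 1/r^2 = c [1 + s cos 2(psi - psi_0)] with c = mu E / p_psi^2 and s = (1 + k p_psi^2 / (mu E^2))^(1/2), takes psi_0 = 0, and uses arbitrary parameter values (the names `c`, `s` and the function name are not from the text). It confirms that the sampled points lie on an ellipse centered on the force center, and that the eccentricity obtained from the axes agrees with e^2/(2 - e^2) = s.

```python
import math

def harmonic_orbit_points(mu, k, E, p_psi, n=360):
    """Sample the orbit 1/r^2 = c*(1 + s*cos(2*psi)), taking psi_0 = 0."""
    c = mu * E / p_psi**2
    s = math.sqrt(1.0 + k * p_psi**2 / (mu * E**2))
    points = []
    for i in range(n):
        psi = 2.0 * math.pi * i / n
        r = 1.0 / math.sqrt(c * (1.0 + s * math.cos(2.0 * psi)))
        points.append((r * math.cos(psi), r * math.sin(psi)))
    return c, s, points

# Illustrative values only: an attractive force (k < 0) with E > 0 gives s < 1.
mu, k, E, p_psi = 1.0, -1.0, 1.0, 0.8
c, s, points = harmonic_orbit_points(mu, k, E, p_psi)

radii = [math.hypot(x, y) for x, y in points]
a, b = max(radii), min(radii)        # semi-major and semi-minor axes
e2_from_axes = 1.0 - (b / a) ** 2    # e^2 = 1 - (b/a)^2
e2_from_s = 2.0 * s / (1.0 + s)      # e^2/(2 - e^2) = s solved for e^2

# With psi_0 = 0 the minor axis lies along x, so x^2/b^2 + y^2/a^2 = 1.
max_residual = max(abs(x * x / b**2 + y * y / a**2 - 1.0) for x, y in points)
print(e2_from_axes, e2_from_s, max_residual)
```

Because r^2 depends on cos 2(psi), the radius passes through two maxima and two minima per revolution, which is the signature of an ellipse whose center, not its focus, sits at the origin.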

Figure 11.8: The elliptical equivalent trajectory for two bodies interacting via the linear, central force for
eccentricity e = 0.75. The left plot shows the elliptical spatial trajectory where the semi-major axis is
assumed to be on the x-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force center is at
the center of the ellipse. The right plot is a hodograph of the linear momentum p for this trajectory.


11.9.2 Cartesian coordinates 


The isotropic harmonic oscillator, expressed in terms of cartesian coordinates in the (x, y) plane of the orbit, 
is separable because there is no direct coupling term between the x and y motion. That is, the center-of-mass
Lagrangian in the (x,y) plane separates into independent motion for x and y. 


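$$L = \frac{1}{2}\mu\,\dot{\mathbf{r}}\cdot\dot{\mathbf{r}} + \frac{1}{2}k\,\mathbf{r}\cdot\mathbf{r} = \left[\frac{1}{2}\mu\dot x^2 + \frac{1}{2}kx^2\right] + \left[\frac{1}{2}\mu\dot y^2 + \frac{1}{2}ky^2\right] \qquad (11.110)$$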


Solutions for the independent coordinates, and their corresponding momenta, are 


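$$\mathbf{r} = \hat{\mathbf{i}}A\cos(\omega t + \alpha) + \hat{\mathbf{j}}B\cos(\omega t + \beta) \qquad (11.111)$$
$$\mathbf{p} = -\hat{\mathbf{i}}A\mu\omega\sin(\omega t + \alpha) - \hat{\mathbf{j}}B\mu\omega\sin(\omega t + \beta) \qquad (11.112)$$
where $\omega = \sqrt{k/\mu}$. Therefore
$$r^2 = x^2 + y^2 = [A\cos(\omega t + \alpha)]^2 + [B\cos(\omega t + \beta)]^2 \qquad (11.113)$$
$$\phantom{r^2} = \frac{A^2 + B^2}{2} + \frac{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}}{2}\cos(2\omega t + \psi_0)$$
where
$$\cos\psi_0 = \frac{A^2\cos 2\alpha + B^2\cos 2\beta}{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}} \qquad (11.114)$$
For a phase difference $\alpha - \beta = \pm\frac{\pi}{2}$, this equation describes an ellipse with semi-axes A and B centered at the origin, which agrees
with equation 11.107 that was derived using polar coordinates.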

The two normal modes of the isotropic harmonic oscillator are degenerate, therefore x, y are equally good
normal modes with two corresponding total energies, $E_1$, $E_2$, while the corresponding angular momentum J
points in the z direction.
$$E_1 = \frac{p_x^2}{2\mu} + \frac{1}{2}kx^2 \qquad (11.115)$$
$$E_2 = \frac{p_y^2}{2\mu} + \frac{1}{2}ky^2 \qquad (11.116)$$
$$J = (xp_y - yp_x) \qquad (11.117)$$

Figure 11.8 shows the closed elliptical equivalent orbit plus the corresponding momentum hodograph for
the isotropic harmonic two-body central force. Figures 11.7 and 11.8 contrast the differences between the 
elliptical orbits for the inverse-square force, and those for the harmonic two-body central force. Although 
the orbits for bound systems with the harmonic two-body force, and the inverse-square force, both lead to 
elliptical bound orbits, there are important differences. Both the radial motion and momentum are two 
valued per cycle for the reflection-symmetric harmonic oscillator, whereas the radius and momentum have 
only one maximum and one minimum per revolution for the inverse-square law. Although the inverse-square, 
and the isotropic, harmonic, two-body central forces both lead to closed bound elliptical orbits for which the 
angular momentum is conserved and the orbits are planar, there is another important difference between the 
orbits for these two interactions. The orbit equation for the Kepler problem is expressed with respect to a
focus of the elliptical equivalent orbit, as illustrated in figure 11.7, whereas the orbit equation for the isotropic
harmonic oscillator orbit is expressed with respect to the center of the ellipse as illustrated in figure 11.8. 


11.9.3 Symmetry tensor A’ 


The invariant vectors L and A provide a complete specification of the geometry of the bound orbits for 
the inverse square-law Kepler system. It is interesting to search for a similar invariant that fully specifies 
the orbits for the isotropic harmonic central force. In contrast to the Kepler problem, the harmonic force 
center is at the center of the elliptical orbit, and the orbit is reflection symmetric with the radial and angular 
frequencies related by $\omega_r = 2\omega_\psi$. Since the orbit is reflection-symmetric, the orientation of the major axis
of the orbit cannot be uniquely specified by a vector. Therefore, for the harmonic interaction it is necessary 
to specify the orientation of the principal axis by the symmetry tensor. The symmetry of the isotropic 
harmonic, two-body, central force leads to the symmetry tensor A’, which is an invariant of the motion 
analogous to the eccentricity vector A. Like a rotation matrix, the symmetry tensor defines the orientation, 
but not direction, of the major principal axis of the elliptical orbit. In the plane of the polar orbit the 3 x 3 
symmetry tensor A’ reduces to a 2 x 2 matrix having matrix elements defined to be, 


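$$A'_{ij} = \frac{p_ip_j}{2\mu} + \frac{1}{2}kx_ix_j \qquad (11.118)$$

The diagonal matrix elements $A'_{11} = E_1$ and $A'_{22} = E_2$ are constants of motion. The off-diagonal
term is given by
$$A'^2_{12} = \left(\frac{p_xp_y}{2\mu} + \frac{1}{2}kxy\right)^2 = \left(\frac{p_x^2}{2\mu} + \frac{1}{2}kx^2\right)\left(\frac{p_y^2}{2\mu} + \frac{1}{2}ky^2\right) - \frac{k}{4\mu}(xp_y - yp_x)^2 = E_1E_2 - \frac{kJ^2}{4\mu} \qquad (11.119)$$

The terms on the right-hand side of equation 11.119 all are constants of motion, therefore $A'_{12}$ also is a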
constant of motion. Thus the 3 x 3 symmetry tensor A’ can be reduced to a 2 x 2 symmetry tensor for which 
all the matrix elements are constants of motion, and the trace of the symmetry tensor is equal to the total 
energy. 

In summary, the inverse-square, and harmonic oscillator two-body central interactions both lead to closed, 
elliptical equivalent orbits, the plane of which is perpendicular to the conserved angular momentum vector. 
However, for the inverse-square force, the origin of the equivalent orbit is at the focus of the ellipse and 
$\omega_r = \omega_\psi$, whereas the origin is at the center of the ellipse and $\omega_r = 2\omega_\psi$ for the harmonic force. As a
consequence, the elliptical orbit is reflection symmetric for the harmonic force but not for the inverse square 
force. The eccentricity vector and symmetry tensor both specify the major axes of these elliptical orbits, 
the plane of which are perpendicular to the angular momentum vector. The eccentricity vector, and the 
symmetry tensor, both are directly related to the eccentricity of the orbit and the total energy of the two- 
body system. Noether’s theorem states that the invariance of the eccentricity vector and symmetry tensor, 
plus the corresponding closed orbits, are manifestations of underlying symmetries. The dynamical SU(3)
symmetry underlies the invariance of the symmetry tensor, whereas the dynamical O(4) symmetry underlies
the invariance of the eccentricity vector. These symmetries lead to stable closed elliptical bound orbits only 
for these two specific two-body central forces, and not for other two-body central forces. 
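
The invariance of the symmetry tensor can be illustrated numerically. The following sketch is a check under stated assumptions, not part of the original derivation: it takes an attractive isotropic oscillator with a positive spring constant `kappa` (a name introduced here, so that the potential is kappa r^2 / 2), uses the analytic solutions x = A cos(wt + alpha), y = B cos(wt + beta) with arbitrary illustrative amplitudes and phases, and evaluates the tensor elements p_i p_j / (2 mu) + kappa x_i x_j / 2 at two different times.

```python
import math

def symmetry_tensor(t, mu=1.0, kappa=4.0, A=1.0, B=0.5, alpha=0.3, beta=1.2):
    """Return (A11, A22, A12, J) at time t for the analytic oscillator solution.

    kappa is an assumed positive spring constant; all values are illustrative.
    """
    w = math.sqrt(kappa / mu)
    x = A * math.cos(w * t + alpha)
    y = B * math.cos(w * t + beta)
    px = -mu * w * A * math.sin(w * t + alpha)
    py = -mu * w * B * math.sin(w * t + beta)
    a11 = px * px / (2.0 * mu) + 0.5 * kappa * x * x   # energy of the x mode
    a22 = py * py / (2.0 * mu) + 0.5 * kappa * y * y   # energy of the y mode
    a12 = px * py / (2.0 * mu) + 0.5 * kappa * x * y   # off-diagonal element
    J = x * py - y * px                                # angular momentum
    return a11, a22, a12, J

early = symmetry_tensor(0.0)
late = symmetry_tensor(1.7)
print(early)
print(late)

# Off-diagonal identity: A12^2 = E1*E2 - kappa*J^2/(4*mu), with mu = 1, kappa = 4.
a11, a22, a12, J = late
identity_gap = a12**2 - (a11 * a22 - 4.0 * J**2 / 4.0)
```

All four quantities come out the same at both times, and the trace A11 + A22 equals the total energy kappa (A^2 + B^2) / 2 of the two modes.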


11.10 Closed-orbit stability


Bertrand’s theorem states that the linear oscillator and 
the inverse-square law are the only two-body, central 
forces for which all bound orbits are single-valued, and 
stable closed orbits. The stability of closed orbits can 
be illustrated by studying their response to perturba- 
tions. For simplicity, the following discussion of stabil- 
ity will focus on circular orbits, but the general prin- 
ciples are the same for elliptical orbits. 

A circular orbit occurs whenever the attractive force just balances the effective "centrifugal force" in
the rotating frame. This can occur for any radial functional form for the central force. The effective
potential, equation 11.33, will have a stationary point when
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = 0 \qquad (11.120)$$
that is, when
$$\left(\frac{\partial U}{\partial r}\right)_{r=r_0} - \frac{l^2}{\mu r_0^3} = 0 \qquad (11.121)$$


This is equivalent to the statement that the net force 
is zero. Since the central attractive force is given by 


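$$F(r) = -\frac{\partial U}{\partial r} \qquad (11.122)$$
then the stationary point occurs when
$$F(r_0) = -\frac{l^2}{\mu r_0^3} = -\mu r_0\dot\psi^2 \qquad (11.123)$$
This is the so-called centrifugal force in the rotating
frame. The Hamiltonian, equation 11.44, gives that
$$\dot r = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \qquad (11.124)$$
For a circular orbit $\dot r = 0$, that is
$$E_{cm} = U + \frac{l^2}{2\mu r_0^2} \qquad (11.125)$$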


A stable circular orbit is possible if both equations (11.121) and (11.125) are satisfied. Such a circular
orbit will be a stable orbit at the minimum when
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} > 0 \qquad (11.126)$$
Examples of stable and unstable orbits are shown in figure 11.9.
Stability of a circular orbit requires that
$$\left(\frac{\partial^2 U}{\partial r^2}\right)_{r=r_0} + \frac{3l^2}{\mu r_0^4} > 0 \qquad (11.127)$$
which can be written in terms of the central force for a stable orbit as
$$-\left(\frac{\partial F}{\partial r}\right)_{r=r_0} - \frac{3F(r_0)}{r_0} > 0 \qquad (11.128)$$
If the attractive central force can be expressed as a power law
$$F(r) = -kr^n \qquad (11.129)$$
then stability requires
$$kr_0^{n-1}(3 + n) > 0 \qquad (11.130)$$
or
$$n > -3 \qquad (11.131)$$

Figure 11.9: Stable and unstable effective central potentials. The repulsive centrifugal and the attractive
potentials (k<0) are shown dashed. The solid curve is the effective potential.


Stable equivalent orbits will undergo oscillations about the stable orbit if perturbed. To first order, the
restoring force on a bound reduced mass $\mu$ is given by
$$F_{restore} = -\left[\frac{\partial^2 U_{eff}}{\partial r^2}\right]_{r=r_0}(r - r_0) = \mu\ddot r \qquad (11.132)$$
To the extent that this linear restoring force dominates over higher-order terms, then a perturbation of the
stable orbit will undergo simple harmonic oscillations about the stable orbit with angular frequency
$$\omega = \sqrt{\frac{1}{\mu}\left[\frac{\partial^2 U_{eff}}{\partial r^2}\right]_{r=r_0}} \qquad (11.133)$$
The above discussion shows that a small amplitude radial oscillation about the stable orbit with displacement
$\xi$ will be of the form
$$\xi = A\sin(2\pi\omega t + \delta)$$
The orbit will be closed if the product of the oscillation frequency $\omega$ and the orbit period $\tau$ is an integer
value.
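
For the power-law force F(r) = -k r^n the stability criterion and the closure condition can be made concrete. Combining the circular-orbit condition l^2 / (mu r_0^3) = k r_0^n with the second derivative of the effective potential gives a radial-to-angular frequency ratio of sqrt(3 + n). The sketch below checks this by finite differences; it is illustrative only, the function names and parameter values are assumptions introduced here, and the potential used is valid only for n different from -1.

```python
import math

def u_eff(r, n, k, mu, l):
    """Effective potential for F = -k*r**n (n != -1): k*r^(n+1)/(n+1) + l^2/(2*mu*r^2)."""
    return k * r ** (n + 1) / (n + 1) + l * l / (2.0 * mu * r * r)

def frequency_ratio(n, k=1.0, mu=1.0, l=1.0, h=1e-4):
    """Numerically estimate omega_r / omega_psi for a near-circular orbit."""
    if n <= -3:
        raise ValueError("no stable circular orbit for n <= -3")
    r0 = (l * l / (mu * k)) ** (1.0 / (n + 3))   # from l^2/(mu*r0^3) = k*r0^n
    # Central-difference second derivative of U_eff at r0.
    curvature = (u_eff(r0 + h, n, k, mu, l) - 2.0 * u_eff(r0, n, k, mu, l)
                 + u_eff(r0 - h, n, k, mu, l)) / (h * h)
    omega_r = math.sqrt(curvature / mu)          # radial oscillation frequency
    omega_psi = l / (mu * r0 * r0)               # orbital angular frequency
    return omega_r / omega_psi

ratio_harmonic = frequency_ratio(1)    # linear restoring force
ratio_kepler = frequency_ratio(-2)     # inverse-square force
print(ratio_harmonic, ratio_kepler, math.sqrt(3 + 1), math.sqrt(3 - 2))
```

The ratio is 2 for the linear force and 1 for the inverse-square force, so the radial oscillation closes after one orbit in both cases. For other exponents, such as n = 0, the ratio sqrt(3) is irrational and a perturbed orbit does not close.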

The fact that planetary orbits in the gravitational field are observed to be closed is strong evidence 
that the gravitational force field must obey the inverse square law. Actually there are small precessions of 
planetary orbits due to perturbations of the gravitational field by bodies other than the sun, and due to 
relativistic effects. Also the gravitational field near the earth departs slightly from the inverse square law 
because the earth is not a perfect sphere, and the field does not have perfect spherical symmetry. The study 
of the precession of satellites around the earth has been used to determine the oblate quadrupole and slight 
octupole (pear shape) distortion of the shape of the earth. 

The most famous test of the inverse square law for gravitation is the precession of the perihelion of 
Mercury. If the attractive force experienced by Mercury is of the form 


$$F(r) = -\frac{k}{r^{2+\alpha}}$$

where $|\alpha|$ is small, then it can be shown that, for approximately circular orbits, the perihelion will advance
by a small angle $\pi\alpha$ per orbit period. That is, the precession is zero if $\alpha = 0$, corresponding to an inverse
square law dependence which agrees with Bertrand’s theorem. The position of the perihelion of Mercury has
been measured with great accuracy showing that, after correcting for all known perturbations, the perihelion
advances by 43(±5) seconds of arc per century, that is $5 \times 10^{-7}$ radians per revolution. This corresponds to
$\alpha = 1.6 \times 10^{-7}$ which is small but still significant. This precession remained a puzzle for many years until
1915 when Einstein predicted that one consequence of his general theory of relativity is that the planetary 
orbit of Mercury should precess at 43 seconds of arc per century, which is in remarkable agreement with 
observations. 
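
The conversion from 43 seconds of arc per century to radians per revolution, and from there to the exponent correction, is simple arithmetic. The sketch below reproduces it, assuming the perihelion advance per orbit is pi times alpha as stated above. Mercury's sidereal period of 87.969 days is an input taken from outside this text, and the function name is introduced here for illustration.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)   # one second of arc in radians

def precession_per_orbit(arcsec_per_century, period_years):
    """Convert a precession rate in arcsec/century into radians per orbit."""
    orbits_per_century = 100.0 / period_years
    return arcsec_per_century * ARCSEC_IN_RAD / orbits_per_century

mercury_period_years = 87.969 / 365.25        # assumed sidereal period
advance = precession_per_orbit(43.0, mercury_period_years)
alpha = advance / math.pi                     # advance per orbit = pi * alpha
print(f"advance per orbit = {advance:.2e} rad, alpha = {alpha:.2e}")
```

The result is about 5 x 10^-7 radians per revolution and an alpha of about 1.6 x 10^-7, in line with the values quoted above.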
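

11.3 Example: Linear two-body restoring force


The effective potential for a linear two-body restoring force F = −kr is
$$U_{eff} = \frac{1}{2}kr^2 + \frac{l^2}{2\mu r^2}$$
At the minimum
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = kr_0 - \frac{l^2}{\mu r_0^3} = 0$$
Thus
$$r_0 = \left(\frac{l^2}{\mu k}\right)^{\frac{1}{4}} \qquad \left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} + k = 4k > 0$$
so small perturbations about this stable circular orbit have an angular frequency
$$\omega = \sqrt{\frac{1}{\mu}\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}} = 2\sqrt{\frac{k}{\mu}}$$
Note that this is twice the frequency for the planar harmonic oscillator with the same restoring coefficient.
This is because, owing to the centrifugal repulsion, the effective potential well for this rotating oscillator has about
half the width of that for the corresponding planar harmonic oscillator. Note that the kinetic energy for the rotational
motion, which is $\frac{l^2}{2\mu r_0^2}$, equals the potential energy $\frac{1}{2}kr_0^2$ at the minimum as predicted by the Virial Theorem
for a linear two-body restoring force.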


11.4 Example: Inverse square law attractive force 


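The effective potential for an inverse square law restoring force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where k is assumed to be
positive, is
$$U_{eff} = -\frac{k}{r} + \frac{l^2}{2\mu r^2}$$
At the minimum
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = \frac{k}{r_0^2} - \frac{l^2}{\mu r_0^3} = 0$$
Thus
$$r_0 = \frac{l^2}{\mu k}$$
and
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} - \frac{2k}{r_0^3} = \frac{k}{r_0^3} > 0$$
which is a stable orbit. Small perturbations about such a stable circular orbit will have an angular frequency
$$\omega = \sqrt{\frac{1}{\mu}\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}} = \sqrt{\frac{k}{\mu r_0^3}}$$
The kinetic energy of the rotational motion for this stable circular orbit, which is $\frac{l^2}{2\mu r_0^2}$, equals half the magnitude
of the potential energy $-\frac{k}{r_0}$ at the minimum as predicted by the Virial Theorem.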


11.5 Example: Attractive inverse cubic central force


The inverse cubic force is an interesting example to investigate the stability of the orbit equations. One 
solution of the inverse cubic central force, for a reduced mass u, is a spiral orbit 


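$$r = r_0e^{\alpha\psi}$$

That this is true can be shown by inserting this orbit into the differential orbit equation.
Using a Binet transformation of the variable r to u gives
$$u = \frac{1}{r} = \frac{1}{r_0}e^{-\alpha\psi}$$
$$\frac{du}{d\psi} = -\frac{\alpha}{r_0}e^{-\alpha\psi}$$
$$\frac{d^2u}{d\psi^2} = \frac{\alpha^2}{r_0}e^{-\alpha\psi}$$
Substituting these into the differential equation of the orbit
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right)$$
gives
$$\frac{\alpha^2}{r_0}e^{-\alpha\psi} + \frac{1}{r_0}e^{-\alpha\psi} = -\frac{\mu}{l^2}r_0^2e^{2\alpha\psi}F\left(\frac{1}{u}\right)$$
That is
$$F\left(\frac{1}{u}\right) = -\left(\alpha^2 + 1\right)\frac{l^2}{\mu}\frac{1}{r_0^3}e^{-3\alpha\psi} = -\left(\alpha^2 + 1\right)\frac{l^2}{\mu r^3}$$
which is a central attractive inverse cubic force.

The time dependence of the spiral orbit can be derived since the angular momentum gives
$$\dot\psi = \frac{l}{\mu r^2} = \frac{l}{\mu r_0^2e^{2\alpha\psi}}$$
This can be written as
$$e^{2\alpha\psi}d\psi = \frac{l}{\mu r_0^2}dt$$
Integrating gives
$$\frac{e^{2\alpha\psi}}{2\alpha} = \frac{lt}{\mu r_0^2} + \beta$$
where $\beta$ is a constant. But the orbit gives
$$\frac{r^2}{r_0^2} = e^{2\alpha\psi} = \frac{2\alpha lt}{\mu r_0^2} + 2\alpha\beta$$
Thus the radius increases or decreases as the square root of the time. That is, an attractive inverse cubic central force
does not have a stable orbit, which is what is expected since there is no minimum in the effective potential
energy. Note that it is obvious that there will be no minimum or maximum for the effective
potential energy since, if the force is $F = -\frac{k}{r^3}$, then the effective potential energy is
$$U_{eff} = -\frac{k}{2r^2} + \frac{l^2}{2\mu r^2} = \left(\frac{l^2}{\mu} - k\right)\frac{1}{2r^2}$$
which has no stable minimum or maximum.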


11.6 Example: Spiralling mass attached by a string to a hanging mass


An example of an application of orbit stability is the case shown in the adjacent figure. A particle of
mass m moves on a horizontal frictionless table. This mass is attached by a light string of fixed length b and
rotates about a hole in the table. The string is attached to a second equal mass m that is hanging vertically
downwards with no angular motion.

[Figure: Rotating mass m on a frictionless horizontal table connected to a suspended mass m.]

The equations are most conveniently expressed in cylindrical coordinates $(r, \theta, z)$ with the origin at the
hole in the table, and z vertically upward. The fixed length of the string requires $z = r - b$.
The potential energy is
$$U = mgz = mg(r - b)$$
The system is central and conservative, thus the Hamiltonian can be written as
$$H = \frac{m}{2}\left(\dot r^2 + r^2\dot\theta^2\right) + \frac{m}{2}\dot z^2 + mg(r - b) = E$$
The Lagrangian is independent of $\theta$, that is, $\theta$ is cyclic, thus the angular momentum $mr^2\dot\theta = l$ is a
constant of motion. Substituting this into the Hamiltonian equation gives
$$m\dot r^2 + \frac{l^2}{2mr^2} + mg(r - b) = E$$
The effective potential is
$$U_{eff} = \frac{l^2}{2mr^2} + mg(r - b)$$
which is shown in the adjacent figure. The stationary value occurs when
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = -\frac{l^2}{mr_0^3} + mg = 0$$
That is, when the angular momentum is related to the radius by
$$l^2 = m^2gr_0^3$$
Note that $r_0 = 0$ if $l = 0$.
The stability of the solution is given by the second derivative
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{mr_0^4} = \frac{3mg}{r_0} > 0$$
Therefore the stationary point is stable.
Note that the equation of motion for the minimum can be expressed in terms of the restoring force on the two masses
$$2m\ddot r = -\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}(r - r_0) = -\frac{3mg}{r_0}(r - r_0)$$
Thus the system undergoes harmonic oscillation with frequency
$$\omega = \sqrt{\frac{3mg}{2mr_0}} = \sqrt{\frac{3g}{2r_0}}$$
The solution of this system is stable and undergoes simple harmonic motion.

[Figure: Effective potential for two connected masses, $U_{eff}(r)$ plotted against r.]
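
The results of this example reduce to two short formulas: the equilibrium radius follows from l^2 = m^2 g r_0^3, and the small-oscillation frequency is sqrt(3g / (2 r_0)). The sketch below evaluates them for illustrative values (m = 0.5 kg and l = 0.2 kg m^2/s are arbitrary, and the function names are introduced here), and confirms that the two forms of the curvature, 3 l^2 / (m r_0^4) and 3 m g / r_0, agree.

```python
import math

G_ACCEL = 9.81  # m/s^2, assumed value of g

def equilibrium_radius(l, m, g=G_ACCEL):
    """Radius of the circular orbit from l^2 = m^2 * g * r0^3."""
    return (l * l / (m * m * g)) ** (1.0 / 3.0)

def oscillation_frequency(r0, g=G_ACCEL):
    """Angular frequency of small radial oscillations, sqrt(3g / (2 r0))."""
    return math.sqrt(3.0 * g / (2.0 * r0))

m, l = 0.5, 0.2                        # illustrative values only
r0 = equilibrium_radius(l, m)
omega = oscillation_frequency(r0)

# Two equivalent expressions for the curvature of U_eff at r0.
curvature_from_l = 3.0 * l * l / (m * r0 ** 4)
curvature_from_g = 3.0 * m * G_ACCEL / r0
print(r0, omega, curvature_from_l, curvature_from_g)
```

The factor of 2m in the equation of motion appears because both masses share the radial motion, which is why the frequency is sqrt(curvature / (2m)) rather than sqrt(curvature / m).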


11.11 The three-body problem


Two bodies interacting via conservative central forces can be solved analytically for the inverse square law
and the Hooke's law radial dependences as already discussed. Central forces that have other radial dependences
for the equations of motion may not be expressible in terms of simple functions, nevertheless the motion
always can be given in terms of an integral. For a gravitational system comprising $n \geq 3$ bodies that are
interacting via the two-body central gravitational force, then the equations of motion can be written as
$$m_j\ddot{\mathbf{q}}_j = G\sum_{k \neq j}^{n} m_jm_k\frac{(\mathbf{q}_k - \mathbf{q}_j)}{|\mathbf{q}_k - \mathbf{q}_j|^3}$$

Even when all the n bodies are interacting via two-body central forces, the problem usually is insoluble in
terms of known analytic integrals. Newton first posed the difficulty of the three-body Kepler problem which
has been studied extensively by mathematicians and physicists. No known general analytic integral
solution has been found. Each body for the n-body system has 6 degrees of freedom, that is, 3 for position
and 3 for momentum. The center-of-mass motion can be factored out, therefore the center-of-mass system for
the n-body system has 6n − 10 degrees of freedom after subtraction of 3 degrees for location of the
center of mass, 3 for the linear momentum of the center of mass, 3 for rotation of the center of mass, and 1
for the total energy of the system. Thus for n = 2 there are 12 − 10 = 2 degrees of freedom for the two-body
system, which the Kepler approach takes to be r and $\theta$. For n = 3 there are 8 degrees of freedom in the
center of mass system that have to be determined.

Figure 11.10: A contour plot of the effective potential for the Sun-Earth gravitational system in the rotating
frame where the Sun and Earth are stationary. The 5 Lagrange points $L_i$ are saddle points where the net force
is zero. (Figure created by NASA)

Numerical solutions to the three-body problem can be obtained using successive approximation or per- 
turbation methods in computer calculations. The problem can be simplified by restricting the motion to 
either of the following two approximations:


1) Planar approximation 


This approximation assumes that the three masses move in the same plane, that is, the number of degrees 
of freedom are reduced from 8 to 6 which simplifies the numerical solution. 


2) Restricted three-body approximation 


The restricted three-body approximation assumes that two of the masses are large and bound while the 
third mass is negligible such that the perturbation of the motion of the larger two by the third body is 
negligible. This approximation essentially reduces the system to a two body problem in order to calculate 
the gravitational fields that act on the third much lighter mass. 

Euler and Lagrange showed that the restricted three-body system has five points at which the combined 
gravitational attraction plus centripetal force of the two large bodies cancel. These are called the Lagrange 
points and are used for parking satellites in stable orbits with respect to the Earth-Moon system, or with 
respect to the Sun-Earth system. Figure 11.10 illustrates the five Lagrange points for the Earth-Sun system. 
Only two of the Lagrange points, $L_4$ and $L_5$, lead to stable orbits. Note that these Lagrange points are fixed
with respect to the Earth-Sun system which rotates with respect to inertial coordinate frames. The 1900’s
discovery of the Trojan asteroids at the $L_4$ and $L_5$ Lagrange points of the Sun-Jupiter system confirmed the
Lagrange predictions. 

Poincaré showed that the motion of a light mass bound to two heavy bodies can exhibit extreme sensitivity 
to initial conditions as well as characteristics of chaos. Solution of the three-body problem has remained a 
largely unsolved problem since Newton identified the difficulties involved. 


11.12 Two-body scattering


Two moving bodies, that are interacting via a central force, scatter when the force is repulsive, or when 
an attractive system is unbound. Two-body scattering of bodies is encountered extensively in the fields of 
astronomy, atomic, nuclear, and particle physics. The probability of such scattering is most conveniently 
expressed in terms of scattering cross sections defined below. 


11.12.1 Total two-body scattering cross section 


The concept of scattering cross section for two-body scattering is most easily described for the total
two-body cross section. The probability P that a beam of $n_B$ incident point particles/second, distributed
over a cross sectional area $A_B$, will hit a single solid object, having a cross sectional area $\sigma$,
is given by the ratio of the areas as illustrated in figure 11.11. That is,
$$P = \frac{\sigma}{A_B} \qquad (11.134)$$
where it is assumed that $A_B \gg \sigma$. For a spherical target body of radius r, the cross section $\sigma = \pi r^2$.
The scattering probability P is proportional to the cross section $\sigma$ which is the cross section of the target
body perpendicular to the beam; thus $\sigma$ has the units of area.

Figure 11.11: Scattering probability for an incident beam of cross sectional area $A_B$ by a target body of cross
sectional area $\sigma$.

Since the incident beam of $n_B$ incident point particles/second has a cross sectional area $A_B$, then it will have
an areal density I given by
$$I = \frac{n_B}{A_B} \quad \text{beam particles/m}^2\text{/sec} \qquad (11.135)$$
The number of beam particles scattered per second $N_S$ by this single target scatterer equals
$$N_S = Pn_B = \frac{\sigma}{A_B}IA_B = \sigma I \qquad (11.136)$$
Thus the cross section for scattering by this single target body is
$$\sigma = \frac{N_S}{I} = \frac{\text{scattered particles/sec}}{\text{incident beam/m}^2\text{/sec}}$$

Realistically one will have many target scatterers in the target and the total scattering probability increases
proportionally to the number of target scatterers. That is, for a target comprising an areal density of $\eta_T$
target bodies per unit area of the incident beam, then the number scattered will increase proportional to the
target areal density $\eta_T$. That is, there will be $\eta_TA_B$ scattering bodies that interact with the beam assuming
that the target has a larger area than the beam. Thus the total number scattered per second $N_S$ by a target
that comprises multiple scatterers is
$$N_S = \sigma\frac{n_B}{A_B}\eta_TA_B = \sigma n_B\eta_T \qquad (11.137)$$

Note that this is independent of the cross sectional area of the beam assuming that the target area is larger
than that of the beam. That is, the number scattered per second is proportional to the cross section $\sigma$ times
the product of the number of incident particles per second, $n_B$, and the areal density of target scatterers,
$\eta_T$. Typical cross sections encountered in astrophysics are $\sigma \approx 10^{14}$ m², in atomic physics $\sigma \approx 10^{-20}$ m²,
and in nuclear physics $\sigma \approx 10^{-28}$ m² = 1 barn.¹

N.B., the above proof assumed that the target size is larger than the cross sectional area of the incident
beam. If the size of the target is smaller than the beam, then $n_B$ is replaced by the areal density per second of the
beam, $\eta_B$, and $\eta_T$ is replaced by the number of target particles $n_T$, and the cross-sectional size of the target
cancels.


The term "barn" was chosen because nuclear physicists joked that the cross sections for neutron scattering by nuclei were 
as large as a barn door. 


11.12.2 Differential two-body scattering cross section


The differential two-body scattering cross section gives much more detailed information of the scattering
force than does the total cross section because of the correlation between the impact parameter and the
scattering angle. That is, a measurement of the number of beam particles scattered into a given solid angle
as a function of scattering angles $\theta, \phi$ probes the radial form of the scattering force.

The differential cross section for scattering of an incident beam by a single target body into a solid angle
$d\Omega$ at scattering angles $\theta, \phi$ is defined to be
$$\frac{d\sigma}{d\Omega}(\theta\phi) \equiv \frac{1}{I}\frac{dN_S(\theta, \phi)}{d\Omega} \qquad (11.138)$$
where the right-hand side is the ratio of the number scattered per target nucleus into solid angle $d\Omega(\theta, \phi)$,
to the incident beam intensity I particles/m²/sec.

Similar reasoning to that used to derive equation 11.137 shows that the number of beam particles scattered into a
solid angle $d\Omega$, for $n_B$ beam particles incident upon a target with areal density $\eta_T$, is
$$\frac{dN_S(\theta, \phi)}{d\Omega} = n_B\eta_T\frac{d\sigma}{d\Omega}(\theta\phi) \qquad (11.139)$$

Figure 11.12: The equivalent one-body problem for scattering of a reduced mass $\mu$ by a force centre in the
centre of mass system.

Consider the equivalent one-body system for scattering of one body by a scattering force center in
the center of mass. As shown in figures 11.6 and 11.12, the perpendicular distance between the center of
force of the two body system and trajectory of the incoming body at infinite distance is called the impact
parameter b. For a central force the scattering system has cylindrical symmetry, therefore the solid angle
$d\Omega(\theta\phi) = \sin\theta\,d\theta\,d\phi$ can be integrated over the azimuthal angle $\phi$ to give $d\Omega(\theta) = 2\pi\sin\theta\,d\theta$.

For the inverse-square, two-body, central force there is a one-to-one correspondence between impact
parameter b and scattering angle $\theta$ for a given bombarding energy. In this case, assuming conservation of
flux means that the incident beam particles passing through the impact-parameter annulus between b and
b + db must equal the number passing between the corresponding angles $\theta$ and $\theta + d\theta$. That is, for an
incident beam flux of I particles/m²/sec the number of particles per second passing through the annulus is
$$I\,2\pi b\,|db| = 2\pi\frac{d\sigma}{d\Omega}I\sin\theta\,|d\theta| \qquad (11.140)$$
The modulus is used to ensure that the number of particles is always positive. Thus
$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| \qquad (11.141)$$


11.12.3 Impact parameter dependence on scattering angle 


If the function $b = f(\theta, E_{cm})$ is known, then it is possible to evaluate $\left|\frac{db}{d\theta}\right|$ which can be used in equation
11.141 to calculate the differential cross section. A simple and important case to consider is two-body elastic
scattering for the inverse-square law force such as the Coulomb or gravitational forces. To avoid confusion in
the following discussion, the center-of-mass scattering angle will be called $\theta$, while the angle used to define
the hyperbolic orbits in the discussion of trajectories for the inverse square law, will be called $\psi$.

In chapter 11.8 the equivalent one-body representation gave that the radial distance for a trajectory for
the inverse square law is given by
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + e\cos\psi\right] \qquad (11.142)$$
Note that closest approach occurs when $\psi = 0$ while for $r \to \infty$ the bracket must equal zero, that is
$$\cos\psi_\infty = \pm\left|\frac{1}{e}\right| \qquad (11.143)$$

The polar angle $\psi$ is measured with respect to the symmetry axis of the two-body system which is along
the line of distance of closest approach as shown in figure 11.6. The geometry and symmetry show that the
scattering angle $\theta$ is related to the trajectory angle $\psi_\infty$ by
$$\theta = \pi - 2\psi_\infty \qquad (11.144)$$
Equation 11.50 gives that
$$\psi_\infty = \int_{r_{min}}^{\infty}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \qquad (11.145)$$
Since
$$l = bp = b\sqrt{2\mu E_{cm}} \qquad (11.146)$$
then the scattering angle can be written as
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{r_{min}}^{\infty}\frac{b\,dr}{r^2\sqrt{1 - \frac{U}{E_{cm}} - \frac{b^2}{r^2}}} \qquad (11.147)$$
Let $u = \frac{1}{r}$, then
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}}\frac{b\,du}{\sqrt{1 - \frac{U}{E_{cm}} - b^2u^2}} \qquad (11.148)$$
For the repulsive inverse square law
$$U = \frac{k}{r} = ku \qquad (11.149)$$
where k is taken to be positive for a repulsive force. Thus the scattering angle relation becomes
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}}\frac{b\,du}{\sqrt{1 - \frac{k}{E_{cm}}u - b^2u^2}} \qquad (11.150)$$
The solution of this equation is given by equation 11.63 to be
$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + e\cos\psi\right] \qquad (11.151)$$
where the eccentricity
$$e = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \qquad (11.152)$$
For $r \to \infty$, u = 0 then, as shown previously,
$$\frac{1}{e} = \cos\psi_\infty = \cos\frac{\pi - \theta}{2} = \sin\frac{\theta}{2} \qquad (11.153)$$
Therefore
$$\frac{2E_{cm}b}{k} = \sqrt{e^2 - 1} = \cot\frac{\theta}{2} \qquad (11.154)$$
that is, the impact parameter b is given by the relation
$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \qquad (11.155)$$
Thus, for an inverse-square law force, the two-body scattering has a one-to-one correspondence between
impact parameter b and scattering angle $\theta$ as shown schematically in figure 11.13.

Figure 11.13: Impact parameter dependence on scattering angle for Rutherford scattering.

If k is negative, which corresponds to an attractive inverse square law, then one gets the same relation
between impact parameter and scattering angle except that the sign of the impact parameter b is
opposite. This means that the hyperbolic trajectory has an interior rather than exterior focus. That is, the
trajectory partially orbits around the center of force rather than being repelled away.

Note that the distance of closest approach is related to the eccentricity e by equation 11.151, therefore
$$r_{min} = \frac{k}{2E_{cm}}\left(1 + e\right) \qquad (11.156)$$
$$r_{min} = \frac{k}{2E_{cm}}\left(1 + \frac{1}{\sin\frac{\theta}{2}}\right) \qquad (11.157)$$
Note that for $\theta = 180°$ then
$$E_{cm} = \frac{k}{r_{min}} = U(r_{min}) \qquad (11.158)$$
which is what you would expect from equating the incident kinetic energy to the potential energy at the
distance of closest approach.

Figure 11.14: Classical trajectories for scattering to a given angle by the repulsive Coulomb field plus the
attractive nuclear field for three different impact parameters. Path 1 is pure Coulomb. Paths 2 and 3 include
Coulomb plus nuclear interactions. The dashed parts of trajectories 2 and 3 correspond to only the Coulomb
force acting, i.e. zero nuclear force.

For scattering of two nuclei by the repulsive Coulomb force, if the impact parameter becomes small enough,
the attractive nuclear force also acts, leading to impact-parameter dependent effective potentials
illustrated in figure 11.14. Trajectory 1 does not overlap the nuclear force and thus is pure Coulomb.
Trajectory 2 interacts at the periphery of the nuclear potential and the trajectory deviates from pure Coulomb
shown dashed. Trajectory 3 passes through the interior of the nuclear potential. These three trajectories all
can lead to the same scattering angle and thus there no longer is a one-to-one correspondence between
scattering angle and impact parameter.


11.12.4 Rutherford scattering 


Two models of the atom evolved in the early 1900s. The Rutherford model assumed electrons orbiting around a
small nucleus like planets around the sun, while J.J. Thomson's "plum-pudding" model assumed the electrons
were embedded in a uniform sphere of positive charge the size of the atom. When Rutherford derived his
classical formula in 1911 he realized that it can be used to determine the size of the nucleus since the electric
field obeys the inverse square law only when outside of the charged spherical nucleus. Inside a uniform sphere
of charge the electric field is $E \propto r$ and thus the scattering cross section will not obey the Rutherford relation
for distances of closest approach that are less than the radius of the charged sphere. Observation
of the angle beyond which the Rutherford formula breaks down immediately determines the radius of the
nucleus.

For pure Coulomb scattering, equation 11.155 can be used to evaluate $\left|\frac{db}{d\theta}\right|$, which when used in equation
11.141, gives the center-of-mass Rutherford scattering cross section

$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^{2}\frac{1}{\sin^{4}\frac{\theta}{2}} \tag{11.159}$$

This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scat-
tering of nuclei in the Coulomb potential, the constant $k$ is given to be

$$k = \frac{Z_{P}Z_{T}e^{2}}{4\pi\varepsilon_{0}} \tag{11.160}$$
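
As an illustration, equations 11.159 and 11.160 can be combined into a short function. This is a sketch under stated assumptions: the value $e^{2}/4\pi\varepsilon_{0} \approx 1.43996$ MeV fm is a standard rounded constant that does not appear in the text, and the function name and unit choices (MeV, fm) are illustrative.

```python
import math

# Assumed constant, not from the text: e^2 / (4 pi eps0) in MeV fm (rounded).
COULOMB_MEV_FM = 1.43996

def rutherford_cross_section(z_p, z_t, e_cm, theta):
    """Equation 11.159 in fm^2 per steradian.

    z_p, z_t : projectile and target atomic numbers
    e_cm     : center-of-mass energy in MeV
    theta    : center-of-mass scattering angle in radians
    """
    k = z_p * z_t * COULOMB_MEV_FM  # equation 11.160, in MeV fm
    return 0.25 * (k / (2.0 * e_cm)) ** 2 / math.sin(theta / 2.0) ** 4

if __name__ == "__main__":
    # Illustrative case: alpha particles (Z = 2) on gold (Z = 79) at E_cm = 7.5 MeV.
    for degrees in (30, 60, 90, 120, 180):
        print(degrees, rutherford_cross_section(2, 79, 7.5, math.radians(degrees)))
```

The steep $\sin^{-4}(\theta/2)$ dependence means that the cross section at $90^{\circ}$ is four times that at $180^{\circ}$, while at forward angles it grows by orders of magnitude.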


The cross section, scattering angle and $E_{cm}$ of equation 11.159 are evaluated in the center-of-mass
coordinate system, whereas usually two-body elastic scattering data involve scattering of the projectiles by a
stationary target as discussed in chapter 11.13.

Geiger and Marsden performed scattering of 7.7 MeV $\alpha$ particles from a thin gold foil and proved that
the differential scattering cross section obeyed the Rutherford formula back to angles corresponding to a
distance of closest approach of $10^{-14}$ m which is much smaller than the $10^{-10}$ m size of the atom. This
validated the Rutherford model of the atom and immediately led to the Bohr model of the atom which
played such a crucial role in the development of quantum mechanics. Bohr showed that the agreement with
the Rutherford formula implies the Coulomb field obeys the inverse square law to small distances. This work
was performed at Manchester University, England between 1908 and 1913. It is fortunate that the classical
result is identical to the quantal cross section for scattering, otherwise the development of modern physics
could have been delayed for many years.

Scattering of very heavy ions, such as $^{208}$Pb, can electromagnetically excite target nuclei. For the Coulomb
force the impact parameter $b$ and the distance of closest approach, $r_{min}$, are directly related to the scattering
angle $\theta$ by equation 11.155. Thus observing the angle of the scattered projectile unambiguously determines
the hyperbolic trajectory and thus the electromagnetic impulse given to the colliding nuclei. This process, 
called Coulomb excitation, uses the measured angular distribution of the scattered ions for inelastic excitation 
of the nuclei to precisely and unambiguously determine the Coulomb excitation cross section as a function 
of impact parameter. This unambiguously determines the shape of the nuclear charge distribution. 


11.7 Example: Two-body scattering by an inverse cubic force 


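Assume two-body scattering by a potential $U = \frac{k}{r^{2}}$ where $k > 0$. This corresponds to a repulsive two-body
force $\mathbf{F} = \frac{2k}{r^{3}}\hat{\mathbf{r}}$. Inserting this force into Binet's differential orbit equation 11.39 gives

$$\frac{d^{2}u}{d\psi^{2}} + u\left(1 + \frac{2k\mu}{l^{2}}\right) = 0$$

The solution is of the form $u = A\sin(\omega\psi + \beta)$ where $A$ and $\beta$ are constants of integration, $l = \mu r^{2}\dot{\psi}$, and

$$\omega^{2} = 1 + \frac{2k\mu}{l^{2}}$$

Initially $r = \infty$, $u = 0$, and therefore $\beta = 0$. Also at $r = \infty$, $E = \frac{1}{2}\mu\dot{r}_{\infty}^{2}$, that is $|\dot{r}_{\infty}| = \sqrt{\frac{2E}{\mu}}$. Then

$$\dot{r} = \frac{dr}{d\psi}\dot{\psi} = \frac{dr}{d\psi}\frac{l}{\mu r^{2}} = -\frac{l}{\mu}\frac{du}{d\psi} = -A\frac{l}{\mu}\omega\cos(\omega\psi)$$

The initial energy gives that $A = \frac{1}{l\omega}\sqrt{2\mu E}$. Hence the orbit equation is

$$u = \frac{1}{r} = \frac{\sqrt{2\mu E}}{l\omega}\sin(\omega\psi)$$

The above trajectory has a distance of closest approach, $r_{min}$, when $\psi_{min} = \frac{\pi}{2\omega}$. Moreover, due to the
symmetry of the orbit, the scattering angle $\theta$ is given by

$$\theta = \pi - 2\psi_{min} = \pi\left(1 - \frac{1}{\omega}\right)$$

Since $l^{2} = \mu^{2}b^{2}\dot{r}_{\infty}^{2} = 2b^{2}\mu E$ then

$$1 - \frac{\theta}{\pi} = \left(1 + \frac{2k\mu}{l^{2}}\right)^{-\frac{1}{2}} = \left(1 + \frac{k}{b^{2}E}\right)^{-\frac{1}{2}}$$

This gives that the impact parameter $b$ is related to scattering angle by

$$b^{2} = \frac{k}{E}\frac{(\pi - \theta)^{2}}{(2\pi - \theta)\,\theta}$$

This impact parameter relation can be used in equation 11.141 to give the differential cross section

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| = \frac{k}{E\sin\theta}\frac{\pi^{2}(\pi - \theta)}{(2\pi - \theta)^{2}\,\theta^{2}}$$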
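
The closed-form cross section for the inverse cubic force can be cross-checked numerically by differentiating the impact-parameter relation. The sketch below is illustrative only; the function names, the finite-difference step and the test values of $k$ and $E$ are assumptions.

```python
import math

def b_squared(theta, k, energy):
    """Impact parameter squared for U = k / r^2 as a function of scattering angle."""
    return (k / energy) * (math.pi - theta) ** 2 / ((2.0 * math.pi - theta) * theta)

def scattering_angle(b, k, energy):
    """theta = pi (1 - 1/omega) with omega^2 = 1 + k / (b^2 E)."""
    omega = math.sqrt(1.0 + k / (b * b * energy))
    return math.pi * (1.0 - 1.0 / omega)

def cross_section(theta, k, energy):
    """Closed-form differential cross section for the inverse cubic force."""
    return (k / (energy * math.sin(theta))) * math.pi ** 2 * (math.pi - theta) / (
        (2.0 * math.pi - theta) ** 2 * theta ** 2
    )

def cross_section_numeric(theta, k, energy, h=1e-6):
    """(b / sin theta) |db/dtheta| = |d(b^2)/dtheta| / (2 sin theta), by central difference."""
    slope = (b_squared(theta + h, k, energy) - b_squared(theta - h, k, energy)) / (2.0 * h)
    return abs(slope) / (2.0 * math.sin(theta))

if __name__ == "__main__":
    k, energy = 2.0, 3.0  # illustrative values in arbitrary consistent units
    for theta in (0.5, 1.0, 2.0):
        print(theta, cross_section(theta, k, energy), cross_section_numeric(theta, k, energy))
```

The two columns agree to within the accuracy of the finite difference, and `scattering_angle` inverts `b_squared`, which confirms that the relations above are mutually consistent.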


These orbits are called Cotes spirals.


11.13 Two-body kinematics


So far the discussion has been restricted to the center-of-momentum system. Actual scattering measurements 
are performed in the laboratory frame, and thus it is necessary to transform the scattering angle, energies 
and cross sections between the laboratory and center-of-momentum coordinate frame. In principle the 
transformation between the center-of-momentum and laboratory frames is straightforward, using the vector 
addition of the center-of-mass velocity vector and the center-of-momentum velocity vectors of the two bodies. 
The following discussion assumes non-relativistic kinematics apply. 

In chapter 2.8 it was shown that, for Newtonian mechanics, the center-of-mass and center-of-momentum
frames of reference are identical. By definition, in the center-of-momentum frame the linear momenta of the
incoming projectile, $\mathbf{p}_{P}^{initial}$, and target, $\mathbf{p}_{T}^{initial}$, are equal and opposite. That is

$$\mathbf{p}_{P}^{initial} + \mathbf{p}_{T}^{initial} = 0 \tag{11.161}$$

Using the center-of-momentum frame, coupled with the conservation of linear momentum, implies that the
vector sum of the final momenta of the $N$ reaction products, $\mathbf{p}_{i}^{final}$, also is zero. That is

$$\sum_{i=1}^{N}\mathbf{p}_{i}^{final} = 0 \tag{11.162}$$

An additional constraint is that energy conservation relates the initial and final kinetic energies by

$$\frac{(p_{P}^{initial})^{2}}{2m_{P}} + \frac{(p_{T}^{initial})^{2}}{2m_{T}} + Q = \frac{(p_{P}^{final})^{2}}{2m_{P}} + \frac{(p_{T}^{final})^{2}}{2m_{T}} \tag{11.163}$$

where the $Q$ value is the energy contributed to the final total kinetic energy by the reaction between the
incoming projectile and target. For exothermic reactions, $Q > 0$, the summed kinetic energy of the reaction products
exceeds the sum of the incoming kinetic energies, while for endothermic reactions, $Q < 0$, the summed kinetic
energy of the reaction products is less than that of the incoming channel.

For two-body kinematics, the following are three advantages to working in the center-of-momentum frame 
of reference. 


1. The two incident colliding bodies are collinear, as are the two final bodies.


2. The linear momenta of the two bodies are equal in magnitude and opposite in direction in both the
incident channel and the outgoing channel.


3. The total energy in the center-of-momentum coordinate frame is the energy available to the reac- 
tion during the collision. The trivial kinetic energy of the center-of-momentum frame relative to the 
laboratory frame is handled separately. 


The kinematics for two-body reactions is easily determined using the conservation of linear momentum
along and perpendicular to the beam direction plus the conservation of energy, equations 11.161 to 11.163. Note that it
is common practice to use the term “center-of-mass” rather than “center-of-momentum” in spite of the fact
that, for relativistic mechanics, only the center-of-momentum is a meaningful concept.

General features of the transformation between the center-of-momentum and laboratory frames of refer-
ence are best illustrated by elastic or inelastic scattering of nuclei where the two reaction products in the final
channel are identical to the incident bodies. Inelastic excitation of an excited state of energy $\Delta E_{exc}$ in either
reaction product corresponds to $Q = -\Delta E_{exc}$, while elastic scattering corresponds to $Q = -\Delta E_{exc} = 0$.

For inelastic scattering, the conservation of linear momenta for the outgoing channel in the center-of-
momentum frame simplifies to

$$\mathbf{p}_{P}^{final} + \mathbf{p}_{T}^{final} = 0 \tag{11.164}$$

that is, the linear momenta of the two reaction products are equal and opposite.

Assume that the center-of-momentum direction of the scattered projectile is at an angle $\vartheta^{P}_{cm} = \vartheta$ relative
to the direction of the incoming projectile and that the scattered target nucleus is scattered at a center-
of-momentum direction $\vartheta^{T}_{cm} = \pi - \vartheta$. Elastic scattering corresponds to simple scattering for which the
magnitudes of the incoming and outgoing projectile momenta are equal, that is, $|\mathbf{p}_{P}^{final}| = |\mathbf{p}_{P}^{initial}|$.

Figure 11.15: Vector hodograph of the scattered projectile and target velocities for a projectile, with incident
velocity $v_{i}$, that is elastically scattered by a stationary target body. The circles show the magnitude of the
projectile and target body final velocities in the center of mass. The center-of-mass velocity vectors are
shown as dashed lines while the laboratory vectors are shown as solid lines. The left hodograph shows
normal kinematics where the projectile mass is less than the target mass. The right hodograph shows
inverse kinematics where the projectile mass is greater than the target mass. For elastic scattering $u'_{P} = u_{P}$.


Velocities 


The transformation between the center-of-momentum and laboratory frames requires knowledge of the par-
ticle velocities which can be derived from the linear momenta since the particle masses are known. Assume
that a projectile, mass $m_{P}$, with incident energy $E_{P}$ in the laboratory frame bombards a stationary target
with mass $m_{T}$. The incident projectile velocity $v_{i}$ is given by

$$v_{i} = \sqrt{\frac{2E_{P}}{m_{P}}} \tag{11.165}$$

The initial velocities in the laboratory frame are taken to be

$$w_{P} = v_{i}, \qquad w_{T} = 0 \qquad \text{(Initial Lab velocities)}$$

The final velocities in the laboratory frame after the inelastic collision are

$$w'_{P}, \qquad w'_{T} \qquad \text{(Final Lab velocities)}$$

In the center-of-momentum coordinate system, equation 11.10 implies that the initial center-of-momentum
velocities are

$$u_{P} = v_{i}\frac{m_{T}}{m_{P}+m_{T}}, \qquad u_{T} = v_{i}\frac{m_{P}}{m_{P}+m_{T}} \tag{11.166}$$

It is simple to derive that the final center-of-momentum velocities after the inelastic collision are given
by

$$u'_{P} = \frac{m_{T}}{m_{P}+m_{T}}\sqrt{\frac{2\tilde{E}}{m_{P}}}, \qquad u'_{T} = \frac{m_{P}}{m_{P}+m_{T}}\sqrt{\frac{2\tilde{E}}{m_{P}}} \tag{11.167}$$

The energy $\tilde{E}$ is defined to be given by

$$\tilde{E} = E_{P} + Q\left(1 + \frac{m_{P}}{m_{T}}\right) \tag{11.168}$$

where $Q = -\Delta E_{exc}$ and $\Delta E_{exc}$ is the excitation energy of the final excited states in the outgoing channel.


Angles 


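The angles of the scattered recoils are written as

$$\theta^{P}_{lab}, \qquad \theta^{T}_{lab} \qquad \text{(Final laboratory angles)}$$

and

$$\vartheta^{P}_{cm} = \vartheta, \qquad \vartheta^{T}_{cm} = \pi - \vartheta \qquad \text{(Final CM angles)}$$

where $\vartheta$ is the center-of-mass (center-of-momentum) scattering angle.
Figure 11.15 shows that the angle relations between the laboratory and center of momentum frames for
the scattered projectile are connected by

$$\frac{\sin(\vartheta^{P}_{cm} - \theta^{P}_{lab})}{\sin\theta^{P}_{lab}} = \frac{m_{P}}{m_{T}}\sqrt{\frac{E_{P}}{\tilde{E}}} = \tau \tag{11.169}$$

where

$$\tau = \frac{m_{P}}{m_{T}}\frac{1}{\sqrt{1 + \frac{Q}{E_{P}}\left(1 + \frac{m_{P}}{m_{T}}\right)}} = \frac{m_{P}}{m_{T}}\frac{1}{\sqrt{1 + \frac{Q}{E_{P}/m_{P}}\left(\frac{m_{P}+m_{T}}{m_{P}m_{T}}\right)}} \tag{11.170}$$

and $\frac{E_{P}}{m_{P}}$ is the energy per nucleon of the incident projectile.

Equation 11.169 can be rewritten as

$$\tan\theta^{P}_{lab} = \frac{\sin\vartheta^{P}_{cm}}{\cos\vartheta^{P}_{cm} + \tau} \tag{11.171}$$

Another useful relation from equation 11.169 gives the center-of-momentum scattering angle in terms of
the laboratory scattering angle.

$$\vartheta^{P}_{cm} = \sin^{-1}\left(\tau\sin\theta^{P}_{lab}\right) + \theta^{P}_{lab} \tag{11.172}$$

This gives the difference in angle between the lab scattering angle and the center-of-momentum scattering
angle. Be careful with this relation since $\vartheta^{P}_{cm}$ is two-valued for inverse kinematics corresponding to the two
possible signs for the solution.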
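
The angle relations 11.168 to 11.172 translate directly into code. The following is a minimal sketch for the scattered projectile; the function names are illustrative, and using mass numbers in place of exact nuclear masses is an approximation assumed here, not something stated in the text.

```python
import math

def e_tilde(e_p, q, m_p, m_t):
    """Equation 11.168: E~ = E_P + Q (1 + m_P / m_T)."""
    return e_p + q * (1.0 + m_p / m_t)

def tau(e_p, q, m_p, m_t):
    """Equation 11.169: tau = (m_P / m_T) sqrt(E_P / E~)."""
    return (m_p / m_t) * math.sqrt(e_p / e_tilde(e_p, q, m_p, m_t))

def lab_angle_from_cm(theta_cm, tau_value):
    """Equation 11.171; atan2 keeps the laboratory angle in the range [0, pi]."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau_value)

def cm_angle_from_lab(theta_lab, tau_value):
    """Equation 11.172, principal branch only (sufficient when tau < 1)."""
    return math.asin(tau_value * math.sin(theta_lab)) + theta_lab

if __name__ == "__main__":
    # Illustrative normal-kinematics case: elastic scattering with m_P / m_T = 0.5.
    t = tau(e_p=447.0, q=0.0, m_p=104.0, m_t=208.0)
    for degrees in (30, 90, 150):
        theta_cm = math.radians(degrees)
        print(degrees, math.degrees(lab_angle_from_cm(theta_cm, t)))
```

For $\tau < 1$ the principal branch of the inverse sine is the only physical solution, so `cm_angle_from_lab` inverts `lab_angle_from_cm`. For $\tau > 1$ a second branch must also be considered.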


The angle relations between the lab and center-of-momentum for the recoiling target nucleus are connected
by

$$\frac{\sin(\vartheta^{T}_{cm} - \theta^{T}_{lab})}{\sin\theta^{T}_{lab}} = \sqrt{\frac{E_{P}}{\tilde{E}}} = \tilde{\tau} \tag{11.173}$$

That is

$$\vartheta^{T}_{cm} = \sin^{-1}\left(\tilde{\tau}\sin\theta^{T}_{lab}\right) + \theta^{T}_{lab} \tag{11.174}$$

where

$$\tilde{\tau} = \frac{1}{\sqrt{1 + \frac{Q}{E_{P}}\left(1 + \frac{m_{P}}{m_{T}}\right)}} = \frac{1}{\sqrt{1 + \frac{Q}{E_{P}/m_{P}}\left(\frac{m_{P}+m_{T}}{m_{P}m_{T}}\right)}}$$

Figure 11.16: The kinematic correlation of the laboratory and center-of-mass scattering angles of the recoiling
projectile and target nuclei for scattering for 4.3 MeV/nucleon $^{104}$Pd on $^{208}$Pb (left) and for the inverse
4.3 MeV/nucleon $^{208}$Pb on $^{104}$Pd (right). The projectile scattering angles are shown by solid lines while the
recoiling target angles are shown by dashed lines. The blue curves correspond to elastic scattering, that is
$Q = 0$, while the red curves correspond to inelastic scattering with $Q = -5$ MeV.

Note that $\tilde{\tau}$ is the same under interchange of the two nuclei at the same incident energy/nucleon, and
that $\tilde{\tau}$ is always larger than or equal to unity since $Q$ is negative. For elastic scattering $\tilde{\tau} = 1$ which gives

$$\theta^{T}_{lab} = \frac{1}{2}(\pi - \vartheta) \qquad \text{(Recoil lab angle for elastic scattering)} \tag{11.175}$$

For the target recoil equation 11.173 can be rewritten as

$$\tan\theta^{T}_{lab} = \frac{\sin\vartheta^{T}_{cm}}{\cos\vartheta^{T}_{cm} + \tilde{\tau}} \qquad \text{(Target lab to CM angle conversion)}$$


Velocity vector hodographs provide useful insight into the behavior of the kinematic solutions. As shown
in figure 11.15, in the center-of-momentum frame the scattered projectile has a fixed final velocity $u'_{P}$, that
is, the velocity vector describes a circle as a function of $\vartheta$. The vector addition of this vector and the velocity
of the center-of-mass vector $-\mathbf{u}_{T}$ gives the laboratory frame velocity $w'_{P}$. Note that for normal kinematics,
where $m_{P} < m_{T}$, then $|u_{T}| < |u'_{P}|$ leading to a monotonic one-to-one mapping of the center-of-momentum
angle $\vartheta_{P}$ and $\theta_{P}$. However, for inverse kinematics, where $m_{P} > m_{T}$, then $|u_{T}| > |u'_{P}|$ leading to two-valued
$\vartheta$ solutions at any fixed laboratory scattering angle $\theta$.


Billiard ball collisions are an especially simple example where the two masses are identical and the collision
is essentially elastic. Then essentially $\tau = \tilde{\tau} = 1$, $\theta^{P}_{lab} = \frac{\vartheta_{cm}}{2}$, and $\theta^{T}_{lab} = \frac{1}{2}(\pi - \vartheta_{cm})$, that is, the angle
between the scattered billiard balls is $\frac{\pi}{2}$.

Both normal and inverse kinematics are illustrated in figure 11.16 which shows the dependence of the
projectile and target scattering angles in the laboratory frame as a function of center-of-momentum scattering
angle for the Coulomb scattering of $^{104}$Pd by $^{208}$Pb, that is, for a mass ratio of 2 : 1. Both normal and
inverse kinematics are shown for the same bombarding energy of 4.3 MeV/nucleon for elastic scattering and
for inelastic scattering with a Q-value of $-5$ MeV.

Figure 11.17: Recoil energies, in MeV, versus laboratory scattering angle, shown on the left for scattering
of 447 MeV $^{104}$Pd by $^{208}$Pb with $Q = -5.0$ MeV, and shown on the right for scattering of 894 MeV $^{208}$Pb
on $^{104}$Pd with $Q = -5.0$ MeV.


Since $\sin(\vartheta^{T}_{cm} - \theta^{T}_{lab}) \leq 1$ then equation 11.173 implies that $\tilde{\tau}\sin\theta^{T}_{lab} \leq 1$. Since $\tilde{\tau}$ is always larger than
or equal to unity there is a maximum scattering angle in the laboratory frame for the recoiling target nucleus
given by

$$\sin\theta^{T}_{max} = \frac{1}{\tilde{\tau}} \tag{11.176}$$

For elastic scattering $\theta^{T}_{max} = \sin^{-1}\left(\frac{1}{\tilde{\tau}}\right) = 90^{\circ}$ since $\tilde{\tau} = 1$ for both 894 MeV $^{208}$Pb bombarding $^{104}$Pd, and
the inverse reaction using a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. A Q-value of $-5$ MeV
gives $\tilde{\tau} = 1.002808$ which implies a maximum scattering angle of $\theta^{T}_{max} = 85.71^{\circ}$ for both 894 MeV $^{208}$Pb
bombarding $^{104}$Pd, and the inverse reaction of a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. As a
consequence there are two solutions for $\vartheta^{T}_{cm}$ for any allowed value of $\theta^{T}_{lab}$, as illustrated in figure 11.16.

Since $\sin(\vartheta^{P}_{cm} - \theta^{P}_{lab}) \leq 1$ then equation 11.169 implies that $\tau\sin\theta^{P}_{lab} \leq 1$. For a 447 MeV $^{104}$Pd beam
scattered by a $^{208}$Pb target $\frac{m_{P}}{m_{T}} = 0.50$, thus $\tau = 0.5$ for elastic scattering which implies that there is no
upper bound to $\theta^{P}_{lab}$. This leads to a one-to-one correspondence between $\theta^{P}_{lab}$ and $\vartheta^{P}_{cm}$ for normal kinematics.
In contrast, the projectile has a maximum scattering angle in the laboratory frame for inverse kinematics
since $\frac{m_{P}}{m_{T}} = 2.0$ leading to an upper bound to $\theta^{P}_{lab}$ given by

$$\sin\theta^{P}_{max} = \frac{1}{\tau} \tag{11.177}$$

For elastic scattering $\tau = 2$ implying $\theta^{P}_{max} = 30^{\circ}$. In addition to having a maximum value for $\theta^{P}_{lab}$ when
$\tau > 1$, also there are two solutions for $\vartheta^{P}_{cm}$ for any allowed value of $\theta^{P}_{lab}$. The example of 894 MeV $^{208}$Pb
bombarding $^{104}$Pd leads to a maximum projectile scattering angle of $\theta^{P}_{max} = 30.0^{\circ}$ for elastic scattering and
$\theta^{P}_{max} = 29.907^{\circ}$ for $Q = -5$ MeV.
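
The maximum laboratory angle and the two-valued nature of the center-of-momentum angle for inverse kinematics can be sketched as follows. The function names are illustrative assumptions; the argument `tau_value` stands for $\tau$ when treating the projectile, or $\tilde{\tau}$ when treating the target recoil.

```python
import math

def max_lab_angle(tau_value):
    """Equations 11.176 and 11.177: sin(theta_max) = 1 / tau.

    For tau < 1 there is no upper bound, so pi is returned.
    """
    if tau_value < 1.0:
        return math.pi
    return math.asin(1.0 / tau_value)

def cm_angle_solutions(theta_lab, tau_value):
    """All center-of-momentum angles consistent with a given laboratory angle.

    Solves sin(theta_cm - theta_lab) = tau sin(theta_lab). Returns an empty
    tuple if theta_lab exceeds the maximum angle, one value for tau <= 1,
    and two values for tau > 1.
    """
    s = tau_value * math.sin(theta_lab)
    if s > 1.0:
        return ()
    delta = math.asin(s)
    if tau_value <= 1.0:
        return (theta_lab + delta,)
    return (theta_lab + delta, theta_lab + math.pi - delta)

if __name__ == "__main__":
    # Illustrative inverse-kinematics case: elastic scattering with tau = 2.
    print(math.degrees(max_lab_angle(2.0)))
    print([math.degrees(x) for x in cm_angle_solutions(math.radians(20.0), 2.0)])
```

Both returned solutions for $\tau > 1$ map back to the same laboratory angle through equation 11.171, which is the origin of the two branches seen in figure 11.16.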


Kinetic energies


The initial total kinetic energy in the center-of-momentum frame is

$$E^{initial}_{cm} = E_{P}\frac{m_{T}}{m_{P}+m_{T}} \tag{11.178}$$

The final total kinetic energy in the center-of-momentum frame is

$$E^{final}_{cm} = E^{initial}_{cm} + Q = \tilde{E}\frac{m_{T}}{m_{P}+m_{T}} \tag{11.179}$$

In the laboratory frame the kinetic energies of the scattered projectile and recoiling target nucleus are
given by

$$E^{P}_{lab} = \left(\frac{m_{T}}{m_{P}+m_{T}}\right)^{2}\left(1 + \tau^{2} + 2\tau\cos\vartheta^{P}_{cm}\right)\tilde{E} \tag{11.180}$$

$$E^{T}_{lab} = \frac{m_{P}m_{T}}{(m_{P}+m_{T})^{2}}\left(1 + \tilde{\tau}^{2} + 2\tilde{\tau}\cos\vartheta^{T}_{cm}\right)\tilde{E} \tag{11.181}$$

where $\vartheta^{P}_{cm}$ and $\vartheta^{T}_{cm}$ are the center-of-mass scattering angles respectively for the scattered projectile and
target nuclei.

For the chosen incident energies the normal and inverse reactions give the same center-of-momentum
energy of 298 MeV which is the energy available to the interaction between the colliding nuclei. However,
the kinetic energy of the center-of-momentum is $447 - 298 = 149$ MeV for normal kinematics and $894 - 298 = 596$ MeV
for inverse kinematics. This trivial center-of-momentum kinetic energy does not contribute to the
reaction. Note that inverse kinematics focusses all the scattered nuclei into the forward hemisphere which
reduces the required solid angle for recoil-particle detection.
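
A short numerical sketch of equations 11.178, 11.180 and 11.181 is given below. The function names are illustrative and mass numbers are used in place of exact masses, which is an assumption made for simplicity. The sum of the two laboratory energies should equal $E_{P} + Q$ at every angle, which provides a useful consistency check.

```python
import math

def cm_energy_initial(e_p, m_p, m_t):
    """Equation 11.178: initial total kinetic energy in the center-of-momentum frame."""
    return e_p * m_t / (m_p + m_t)

def lab_energies(e_p, q, m_p, m_t, theta_cm):
    """Equations 11.180 and 11.181 for the projectile and target.

    theta_cm is the center-of-momentum scattering angle of the projectile;
    the target recoils at pi - theta_cm in that frame.
    """
    e_t = e_p + q * (1.0 + m_p / m_t)      # equation 11.168
    tau_target = math.sqrt(e_p / e_t)      # equation 11.173
    tau_proj = (m_p / m_t) * tau_target    # equation 11.169
    total = m_p + m_t
    e_proj = (m_t / total) ** 2 * (
        1.0 + tau_proj ** 2 + 2.0 * tau_proj * math.cos(theta_cm)
    ) * e_t
    e_targ = (m_p * m_t / total ** 2) * (
        1.0 + tau_target ** 2 + 2.0 * tau_target * math.cos(math.pi - theta_cm)
    ) * e_t
    return e_proj, e_targ

if __name__ == "__main__":
    print(cm_energy_initial(447.0, 104.0, 208.0))  # normal kinematics
    print(cm_energy_initial(894.0, 208.0, 104.0))  # inverse kinematics
    print(lab_energies(447.0, -5.0, 104.0, 208.0, math.radians(60.0)))
```

Both calls to `cm_energy_initial` give the 298 MeV quoted above for the normal and inverse reactions.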


Solid angles 


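The laboratory-frame solid angles for the scattered projectile and target are taken to be $d\omega_{P}$ and $d\omega_{T}$
respectively, while the center-of-momentum solid angles are $d\Omega_{P}$ and $d\Omega_{T}$ respectively. The Jacobian relating
the solid angles is

$$\left|\frac{d\omega_{P}}{d\Omega_{P}}\right| = \left(\frac{\sin\theta^{P}_{lab}}{\sin\vartheta^{P}_{cm}}\right)^{2}\left|\cos(\vartheta^{P}_{cm} - \theta^{P}_{lab})\right| \tag{11.182}$$

$$\left|\frac{d\omega_{T}}{d\Omega_{T}}\right| = \left(\frac{\sin\theta^{T}_{lab}}{\sin\vartheta^{T}_{cm}}\right)^{2}\left|\cos(\vartheta^{T}_{cm} - \theta^{T}_{lab})\right| \tag{11.183}$$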


These can be used to transform the calculated center-of-momentum differential cross sections to the 
laboratory frame for comparison with measured values. Note that relative to the center-of-momentum frame, 
the forward focussing increases the observed differential cross sections in the forward laboratory frame and 
decreases them in the backward hemisphere. 
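
The Jacobian 11.182 can be evaluated once the laboratory angle is obtained from equation 11.171. The sketch below is illustrative; the function names are assumptions, and the closed form $|1 + \tau\cos\vartheta| / (1 + \tau^{2} + 2\tau\cos\vartheta)^{3/2}$ used as a cross-check is a standard result that is not quoted in the text.

```python
import math

def solid_angle_ratio(theta_cm, tau_value):
    """Equation 11.182: |d(omega_lab) / d(Omega_cm)| for the scattered projectile."""
    theta_lab = math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau_value)
    return (math.sin(theta_lab) / math.sin(theta_cm)) ** 2 * abs(
        math.cos(theta_cm - theta_lab)
    )

def solid_angle_ratio_closed_form(theta_cm, tau_value):
    """Equivalent closed form written in terms of the center-of-momentum angle only."""
    c = math.cos(theta_cm)
    return abs(1.0 + tau_value * c) / (1.0 + tau_value ** 2 + 2.0 * tau_value * c) ** 1.5

def lab_cross_section(cm_cross_section, theta_cm, tau_value):
    """Transform a center-of-momentum cross section to the laboratory frame."""
    return cm_cross_section / solid_angle_ratio(theta_cm, tau_value)

if __name__ == "__main__":
    for tau_value in (0.5, 2.0):
        for degrees in (30, 90, 150):
            theta_cm = math.radians(degrees)
            print(tau_value, degrees, solid_angle_ratio(theta_cm, tau_value))
```

At forward angles the ratio is less than one, so the laboratory cross section is enhanced relative to the center-of-momentum value, which is the forward focussing described above.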


Exploitation of two-body kinematics 


Computing the above non-trivial transform relations between the center-of-mass and laboratory coordinate 
frames for two-body scattering is used extensively in many fields of physics. This discussion has assumed non- 
relativistic two-body kinematics. Relativistic two-body kinematics encompasses non-relativistic kinematics 
as discussed in chapter 17.4. Many computer codes are available that can be used for making either non- 
relativistic or relativistic transformations. 

It is stressed that the underlying physics for two interacting bodies is identical irrespective of whether 
the reaction is observed in the center-of-mass or the laboratory coordinate frames. That is, no new physics is 
involved in the kinematic transformation. However, the transformation between these frames can dramati- 
cally alter the angles and velocities of the observed scattered bodies which can be beneficial for experimental 
detection. For example, in heavy-ion nuclear physics the projectile and target nuclei can be interchanged 
leading to very different velocities and scattering angles in the laboratory frame of reference. This can greatly 
facilitate identification and observation of the velocity vectors of the scattered nuclei. In high-energy physics
it is advantageous to collide beams having equal, but opposite, linear momentum vectors, since then the
laboratory frame is the center-of-mass frame, and the energy required to accelerate the colliding bodies is
minimized.


11.14 Summary


This chapter has focussed on the classical mechanics of bodies interacting via conservative, two-body, central 
interactions. The following are the main topics presented in this chapter. 


Equivalent one-body representation for two bodies interacting via a central interaction
The equivalent one-body representation of the motion of two bodies interacting via a two-body central interaction
greatly simplifies solution of the equations of motion. The position vectors $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ are expressed in terms
of the center-of-mass vector $\mathbf{R}$ plus total mass $M = m_{1} + m_{2}$ while the position vector $\mathbf{r}$, plus associated
reduced mass $\mu = \frac{m_{1}m_{2}}{m_{1}+m_{2}}$, describe the relative motion of the two bodies in the center of mass. The total
Lagrangian then separates into two independent parts

$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^{2} + L_{cm} \tag{11.16}$$

where the center-of-mass Lagrangian is

$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^{2} - U(r) \tag{11.17}$$

Equations 11.10 and 11.11 can be used to derive the actual spatial trajectories of the two bodies expressed
in terms of $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$, from the relative equations of motion, written in terms of $\mathbf{R}$ and $\mathbf{r}$, for the equivalent
one-body solution.


Angular momentum
Noether's theorem shows that the angular momentum is conserved if only a spherically-
symmetric two-body central force acts between the interacting two bodies. The plane of motion is perpen-
dicular to the angular momentum vector and thus the Lagrangian can be expressed in polar coordinates
as

$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^{2} + r^{2}\dot{\psi}^{2}\right) - U(r) \tag{11.22}$$


Differential orbit equation of motion
The Binet transformation $u = \frac{1}{r}$ allows the center-of-mass
Lagrangian $L_{cm}$ for a central force $\mathbf{F} = f(r)\hat{\mathbf{r}}$ to be used to express the differential orbit equation for the
radial motion as

$$\frac{d^{2}u}{d\psi^{2}} + u = -\frac{\mu}{l^{2}}\frac{1}{u^{2}}F\left(\frac{1}{u}\right) \tag{11.39}$$

The Lagrangian and the Hamiltonian were both used to derive the equations of motion for two bodies inter-
acting via a two-body, conservative, central interaction. The general features of the conservation of angular
momentum and conservation of energy for a two-body, central potential were presented.


Inverse-square, two-body, central force
The inverse-square, two-body, central force is of pivotal importance in nature since it applies to both
the gravitational force and the Coulomb force. The underlying
symmetries of the inverse-square, two-body, central interaction lead to conservation of angular momentum,
conservation of energy, Gauss's law, and to two-body orbits that are closed, degenerate
conic sections for which the eccentricity vector is conserved. The radial dependence, relative to the force
center lying at one focus of the conic section, is given by

$$\frac{1}{r} = -\frac{\mu k}{l^{2}}\left[1 + \epsilon\cos(\psi - \psi_{0})\right]$$

where the orbit eccentricity $\epsilon$ equals

$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^{2}}{\mu k^{2}}} \tag{11.62}$$

These lead to Kepler's three laws of motion for two bodies in a bound orbit due to the attractive gravitational
force for which $k = -Gm_{1}m_{2}$. The inverse-square law is special in that the eccentricity vector $\mathbf{A}$ is a third
invariant of the motion, where

$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{11.86}$$

The eccentricity vector unambiguously defines the orientation and direction of the major axis of the elliptical
orbit. The invariance of the eccentricity vector, and the existence of stable closed orbits, are manifestations
of the dynamical O4 symmetry.


Isotropic, harmonic, two-body, central force
The isotropic, harmonic, two-body, central interaction
is of interest since, like the inverse-square law force, it leads to closed elliptical orbits described by

$$\frac{1}{r^{2}} = \frac{E\mu}{p_{\psi}^{2}}\left[1 + \left(1 + \frac{kp_{\psi}^{2}}{E^{2}\mu}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_{0})\right] \tag{11.107}$$

where the eccentricity $\epsilon$ is given by

$$\left(1 + \frac{kp_{\psi}^{2}}{E^{2}\mu}\right)^{\frac{1}{2}} = \frac{\epsilon^{2}}{2 - \epsilon^{2}} \tag{11.108}$$

The harmonic force orbits are distinctly different from those for the inverse-square law in that the force center
is at the center of the ellipse, rather than at the focus for the inverse-square law force. This elliptical orbit
is reflection symmetric for the harmonic force, but not for the inverse square force. The isotropic harmonic
two-body force leads to invariance of the symmetry tensor $\mathbf{A}'$, which is an invariant of the motion analogous
to the eccentricity vector $\mathbf{A}$. This leads to stable closed orbits, which are manifestations of the dynamical
SU3 symmetry.


Orbit stability
Bertrand's theorem states that only the inverse square law and the linear radial depen-
dences of the central forces lead to stable closed bound orbits that do not precess. These are manifestations
of the dynamical symmetries that occur for these two specific radial forms of two-body forces.


The three-body problem
The difficulties encountered in solving the equations of motion for three bodies
that are interacting via two-body central forces were discussed. The three-body motion can include the
existence of chaotic motion. It was shown that solution of the three-body problem is simplified if either the
planar approximation, or the restricted three-body approximation, is applicable.


Two-body scattering
The total and differential two-body scattering cross sections were introduced. It
was shown that for the inverse-square law force there is a simple relation between the impact parameter $b$
and scattering angle $\theta$ given by

$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \tag{11.155}$$

This led to the solution for the differential scattering cross-section for Rutherford scattering due to the
Coulomb interaction.

$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^{2}\frac{1}{\sin^{4}\frac{\theta}{2}} \tag{11.159}$$

This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scat-
tering of nuclei in the Coulomb potential the constant $k$ is given to be

$$k = \frac{Z_{P}Z_{T}e^{2}}{4\pi\varepsilon_{0}} \tag{11.160}$$

Two-body kinematics
The transformation from the center-of-momentum frame to the laboratory frame of
reference was introduced. Such transformations are used extensively in many fields of physics for theoretical
modelling of scattering, and for analysis of experimental data.


Workshop exercises


1. Listed below are several statements concerning central force motion. For each statement, give the reason for
why the statement is true. If a statement is only true in certain situations, then explain when it holds and
when it doesn't. The system referred to below consists of mass $m_{1}$ located at $\mathbf{r}_{1}$ and mass $m_{2}$ located at $\mathbf{r}_{2}$.

• The potential energy of the system depends only on the difference $\mathbf{r}_{1} - \mathbf{r}_{2}$, not on $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ separately.
• The potential energy of the system depends only on the magnitude of $\mathbf{r}_{1} - \mathbf{r}_{2}$, not the direction.
• It is possible to choose an inertial reference frame in which the center of mass of the system is at rest.
• The total energy of the system is conserved.
• The total angular momentum of the system is conserved.


2. A particle of mass $m$ moves in a potential $U(r) = -U_{0}e^{-\lambda^{2}r^{2}}$.

(a) Given the constant $l$, find an implicit equation for the radius of the circular orbit. A circular orbit at
$r = \rho$ is possible if

$$\left.\frac{\partial V}{\partial r}\right|_{r=\rho} = 0$$

where $V$ is the effective potential.

(b) What is the largest value of $l$ for which a circular orbit exists? What is the value of the effective potential
at this critical orbit?


3. A particle of mass $m$ is observed to move in a spiral orbit given by the equation $r = k\theta$, where $k$ is a constant.
Is it possible to have such an orbit in a central force field? If so, determine the form of the force function.


4. The interaction energy between two atoms of mass $m$ is given by the Lennard-Jones potential,
$U(r) = \epsilon\left[(r_{0}/r)^{12} - 2(r_{0}/r)^{6}\right]$.

(a) Determine the Lagrangian of the system where $\mathbf{r}_{1}$ and $\mathbf{r}_{2}$ are the positions of the first and second mass,
respectively.
(b) Rewrite the Lagrangian as a one-body problem in which the center-of-mass is stationary.
(c) Determine the equilibrium point and show that it is stable.
(d) Determine the frequency of small oscillations about the stable point.


5. Consider two bodies of mass $m$ in circular orbit of radius $r_{0}/2$, attracted to each other by a force $F(r)$, where
$r$ is the distance between the masses.

(a) Determine the Lagrangian of the system in the center-of-mass frame (Hint: a one-body problem subject
to a central force).
(b) Determine the angular momentum. Is it conserved?
(c) Determine the equation of motion in $r$ in terms of the angular momentum and $|F(r)|$.
(d) Expand your result in (c) about an equilibrium radius $r_{0}$ and show that the condition for stability
is $\frac{F'(r_{0})}{F(r_{0})} + \frac{3}{r_{0}} > 0$.

6. Consider two charges of equal magnitude $q$ connected by a spring of spring constant $k'$ in circular orbit. Can
the charges oscillate about some equilibrium? If so, what condition must be satisfied?


7. Consider a mass $m$ in orbit around a mass $M$, which is subject to a force $\mathbf{F} = -\frac{k}{r^{2}}\hat{\mathbf{r}}$, where $r$ is the distance
between the masses. Show that the eccentricity vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} - \mu k\hat{\mathbf{r}}$ is conserved.


Problems


1. Show that the areal velocity is constant for a particle moving under the influence of an attractive force given
by $F(r) = -kr$. Calculate the time averages of the kinetic and potential energies and compare with the
results of the virial theorem.


2. Assume that the Earth's orbit is circular and that the Sun's mass suddenly decreases by a factor of two. (a)
What orbit will the Earth then have? (b) Will the Earth escape the solar system?


3. Discuss the motion of a particle in a central inverse-square-law force field for a superimposed force whose
magnitude is inversely proportional to the cube of the distance from the particle to force center; that is

$$F(r) = -\frac{k}{r^{2}} - \frac{\lambda}{r^{3}} \qquad (k, \lambda > 0)$$

Show that the motion is described by a precessing ellipse. Consider the cases

a) $\lambda < \frac{l^{2}}{\mu}$, b) $\lambda = \frac{l^{2}}{\mu}$, c) $\lambda > \frac{l^{2}}{\mu}$, where $l$ is the angular momentum and $\mu$ the reduced mass.


4. A communications satellite is in a circular orbit around the earth at a radius $R$ and velocity $v$. A rocket
accidentally fires quite suddenly, giving the rocket an outward velocity $v$ in addition to its original tangential
velocity $v$.

a) Calculate the ratio of the new energy and angular momentum to the old.

b) Describe the subsequent motion of the satellite and plot $T(r)$, $U(r)$, the net effective potential, and $E(r)$
after the rocket fires.


5. Two identical point objects, each of mass $m$, are bound by a linear two-body force $\mathbf{F} = -k\mathbf{r}$ where $\mathbf{r}$ is the
vector distance between the two point objects. The two point objects each slide on a horizontal frictionless
plane subject to a vertical gravitational field $g$. The two-body system is free to translate, rotate and oscillate
on the surface of the frictionless plane.

a) Derive the Lagrangian for the complete system including translation and relative motion.

b) Use Noether's theorem to identify all constants of motion.

c) Use the Lagrangian to derive the equations of motion for the system.

d) Derive the generalized momenta and the corresponding Hamiltonian.

e) Derive the period for small amplitude oscillations of the relative motion of the two masses.


6. A bound binary star system comprises two spherical stars of mass $m_{1}$ and $m_{2}$ bound by their mutual gravita-
tional attraction. Assume that the only force acting on the stars is their mutual gravitational attraction and let
$r$ be the instantaneous separation distance between the centers of the two stars where $r$ is much larger than
the sum of the radii of the stars.

a) Show that the two-body motion of the binary star system can be represented by an equivalent one-body system
and derive the Lagrangian for this system.

b) Show that the motion for the equivalent one-body system in the center of mass frame lies entirely in a plane
and derive the angle between the normal to the plane and the angular momentum vector.

c) Show whether $H_{cm}$ is a constant of motion and whether it equals the total energy.

d) It is known that a solution to the equation of motion for the equivalent one-body orbit for this gravitational
force has the form

$$\frac{1}{r} = -\frac{\mu k}{l^{2}}\left[1 + \epsilon\cos\theta\right]$$

and that the angular momentum is a constant of motion L = l. Use these to prove that the attractive force leading
to this bound orbit is

$$\mathbf{F} = \frac{k}{r^{2}}\,\hat{\mathbf{r}}$$

where k must be negative.


7. When performing the Rutherford experiment, Geiger and Marsden scattered 7.7 MeV $^{4}$He particles (alpha
particles) from $^{238}$U at a scattering angle in the laboratory frame of $\theta = 90^{\circ}$. Derive the following observables
as measured in the laboratory frame.


(a) The recoil scattering angle of the $^{238}$U in the laboratory frame.

(b) The scattering angles of the $^{4}$He and $^{238}$U in the center-of-mass frame.

(c) The kinetic energies of the $^{4}$He and $^{238}$U in the laboratory frame.

(d) The impact parameter.

(e) The distance of closest approach $r_{min}$.

Chapter 12 


Non-inertial reference frames 


12.1 Introduction 


Newton’s Laws of motion apply only to inertial frames of reference. Inertial frames of reference make it 
possible to use either Newton’s laws of motion, or Lagrangian, or Hamiltonian mechanics, to develop the 
necessary equations of motion. There are certain situations where it is much more convenient to treat the 
motion in a non-inertial frame of reference. Examples are motion in frames of reference undergoing trans- 
lational acceleration, rotating frames of reference, or frames undergoing both translational and rotational 
motion. This chapter will analyze the behavior of dynamical systems in accelerated frames of reference, 
especially rotating frames such as on the surface of the Earth. Newtonian mechanics, as well as the La- 
grangian and Hamiltonian approaches, will be used to handle motion in non-inertial reference frames by 
introducing extra inertial forces that correct for the fact that the motion is being treated with respect to a 
non-inertial reference frame. These inertial forces are often called fictitious even though they appear real in 
the non-inertial frame. The underlying reasons for each of the inertial forces will be discussed followed by a 
presentation of important applications. 


12.2 Translational acceleration of a reference frame 


Consider an inertial system $(x_{fix}, y_{fix}, z_{fix})$ which is fixed in space, and a non-inertial system
$(x'_{mov}, y'_{mov}, z'_{mov})$ that is moving in a direction relative to the fixed frame such as to maintain constant
orientations of the axes relative to the fixed frame, as illustrated in figure 12.1. The fixed frame is
designated to be the unprimed frame and, to avoid confusion, the subscript fix is attached to the fixed
coordinates taken with respect to the fixed coordinate frame. Similarly, the translating reference frame,
which is undergoing translational acceleration, has the subscript mov attached to the coordinates taken with
respect to the translating frame of reference. Newton’s Laws of motion are obeyed only in the inertial
(unprimed) reference frame. The respective position vectors are related by

$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \tag{12.1}$$

where $\mathbf{r}_{fix}$ is the vector relative to the fixed frame, $\mathbf{r}'_{mov}$ is the vector relative to the translationally
accelerating frame and $\mathbf{R}_{fix}$ is the vector from the origin of the fixed frame to the origin of the accelerating
frame. Differentiating equation 12.1 gives the velocity vector relation

$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}'_{mov} \tag{12.2}$$

where $\mathbf{v}_{fix} = \frac{d\mathbf{r}_{fix}}{dt}$, $\mathbf{v}'_{mov} = \frac{d\mathbf{r}'_{mov}}{dt}$ and $\mathbf{V}_{fix} = \frac{d\mathbf{R}_{fix}}{dt}$. Similarly the acceleration vector relation is

$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}'_{mov} \tag{12.3}$$

where $\mathbf{a}_{fix} = \frac{d^{2}\mathbf{r}_{fix}}{dt^{2}}$, $\mathbf{a}'_{mov} = \frac{d^{2}\mathbf{r}'_{mov}}{dt^{2}}$ and $\mathbf{A}_{fix} = \frac{d^{2}\mathbf{R}_{fix}}{dt^{2}}$.

Figure 12.1: Inertial reference frame (unprimed), and translational accelerating frame (primed).


In the fixed frame, Newton’s laws give that

$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} \tag{12.4}$$

The force in the fixed frame can be separated into two terms, one from the acceleration $\mathbf{A}_{fix}$ of the
accelerating frame of reference and one from the acceleration $\mathbf{a}'_{mov}$ with respect to the accelerating frame

$$\mathbf{F}_{fix} = m\mathbf{A}_{fix} + m\mathbf{a}'_{mov} \tag{12.5}$$

Relative to the accelerating reference frame the acceleration is given by

$$m\mathbf{a}'_{mov} = \mathbf{F}_{fix} - m\mathbf{A}_{fix} \tag{12.6}$$

The accelerating frame of reference can exploit Newton’s Laws of motion using an effective translational
force $\mathbf{F}'_{tran} = \mathbf{F}_{fix} - m\mathbf{A}_{fix}$. The additional $-m\mathbf{A}_{fix}$ term is called an inertial force; it can be altered by
choosing a different non-inertial frame of reference, that is, it is dependent on the frame of reference in which
the observer is situated.
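
The relation $m\mathbf{a}'_{mov} = \mathbf{F}_{fix} - m\mathbf{A}_{fix}$ can be checked numerically. The following is a minimal sketch, assuming a projectile acted on only by gravity and a frame with constant translational acceleration; the helper names and all numerical values are illustrative assumptions rather than anything taken from the text.

```python
# Sketch: a projectile observed from a frame with constant translational
# acceleration A_fix. All names and numbers here are illustrative assumptions.

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    return tuple(c * a for a in u)

def effective_force(F_fix, m, A_fix):
    """Effective force in the translating frame: F_fix - m * A_fix."""
    return add(F_fix, scale(-m, A_fix))

m = 2.0                      # mass, kg
g = (0.0, 0.0, -9.81)        # gravitational field in the fixed frame, m/s^2
A_fix = (1.5, 0.0, 3.0)      # acceleration of the moving frame, m/s^2
F_fix = scale(m, g)          # gravity is the only physical force

def r_fix(t):
    """Particle position in the fixed frame (launched from the origin)."""
    v0 = (4.0, 0.0, 10.0)
    return add(scale(t, v0), scale(0.5 * t * t, g))

def R_fix(t):
    """Origin of the moving frame, starting at rest at the fixed origin."""
    return scale(0.5 * t * t, A_fix)

def r_mov(t):
    """Position relative to the accelerating frame: r' = r_fix - R_fix."""
    return add(r_fix(t), scale(-1.0, R_fix(t)))

def second_derivative(f, t, h=1e-3):
    """Central finite-difference estimate of the second time derivative."""
    return tuple((p - 2.0 * q + s) / (h * h)
                 for p, q, s in zip(f(t + h), f(t), f(t - h)))

a_mov = second_derivative(r_mov, t=0.7)
F_eff = effective_force(F_fix, m, A_fix)
print("m * a'_mov      :", scale(m, a_mov))
print("F_fix - m A_fix :", F_eff)
```

The two printed vectors should agree to within finite-difference rounding, which is the content of equation 12.6 for this particular choice of motion.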


12.3 Rotating reference frame 


Consider a rotating frame of reference which will be designated as the double-primed (rotating) frame
to differentiate it from the non-rotating primed (moving) frame, since both may be undergoing
translational acceleration relative to the inertial fixed unprimed frame as described above.


12.3.1 Spatial time derivatives in a rotating, non-translating, reference frame 


For simplicity assume that $\mathbf{R}_{fix} = \mathbf{V}_{fix} = 0$, that is, the primed reference frame is stationary and identical
to the fixed stationary unprimed frame. The double-primed (rotating) frame is a non-inertial frame rotating
with respect to the origin of the fixed primed frame. Appendix D.2.3 shows that an infinitesimal rotation
$d\boldsymbol{\theta}$ about an instantaneous axis of rotation leads to an infinitesimal displacement $d\mathbf{r}^{R}$ where

$$d\mathbf{r}^{R} = d\boldsymbol{\theta} \times \mathbf{r}'_{mov} \tag{12.7}$$

Consider that during a time dt, the position vector in the fixed primed reference frame moves by an
arbitrary infinitesimal distance $d\mathbf{r}'_{mov}$. As illustrated in figure 12.2, this infinitesimal distance in the
primed non-rotating frame can be split into two parts:

a) $d\mathbf{r}^{R} = d\boldsymbol{\theta} \times \mathbf{r}'_{mov}$ which is due to rotation of the rotating frame with respect to the translating primed frame.
b) $(d\mathbf{r}''_{rot})$ which is the motion with respect to the rotating (double-primed) frame.

That is, the motion has been arbitrarily divided into a part that is due to the rotation of the double-primed
frame, plus the vector displacement measured in this rotating (double-primed) frame. It is always possible to
make such a decomposition of the displacement as long as the vector sum can be written as

$$d\mathbf{r}'_{mov} = d\mathbf{r}''_{rot} + d\boldsymbol{\theta} \times \mathbf{r}'_{mov} \tag{12.8}$$

Figure 12.2: Infinitesimal displacement in the non-rotating primed frame and in the rotating double-primed reference frame.

Since $d\boldsymbol{\theta} = \boldsymbol{\omega}\,dt$ then the time differential of the displacement, equation 12.8, can be written as

$$\left(\frac{d\mathbf{r}'}{dt}\right)_{mov} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.9}$$

The important conclusion is that a velocity measured in a non-rotating reference frame $\left(\frac{d\mathbf{r}'}{dt}\right)_{mov}$ can be
expressed as the sum of the velocity $\left(\frac{d\mathbf{r}''}{dt}\right)_{rot}$, measured relative to a rotating frame, plus the term $\boldsymbol{\omega} \times \mathbf{r}'_{mov}$
which accounts for the rotation of the frame. The division of the $d\mathbf{r}'_{mov}$ vector into two parts, a part due to
rotation of the frame plus a part with respect to the rotating frame, is valid for any vector as shown below.


12.3.2 General vector in a rotating, non-translating, reference frame 


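Consider an arbitrary vector $\mathbf{G}$ which can be expressed in terms of components along the three unit vector
basis $\hat{\mathbf{e}}_{i}^{fix}$ in the fixed inertial frame as

$$\mathbf{G} = \sum_{i=1}^{3} G_{i}^{fix}\,\hat{\mathbf{e}}_{i}^{fix} \tag{12.10}$$

Neglecting translational motion, it can also be expressed in terms of the three unit vectors of the non-inertial
rotating frame unit vector basis $\hat{\mathbf{e}}_{i}^{rot}$ as

$$\mathbf{G} = \sum_{i=1}^{3} \left(G_{i}\right)_{rot}\hat{\mathbf{e}}_{i}^{rot} \tag{12.11}$$

Since the unit basis vectors $\hat{\mathbf{e}}_{i}^{rot}$ are constant in the rotating frame, that is,

$$\left(\frac{d\hat{\mathbf{e}}_{i}^{rot}}{dt}\right)_{rot} = 0 \tag{12.12}$$

then the time derivative of $\mathbf{G}$ in the rotating coordinate system $\hat{\mathbf{e}}_{i}^{rot}$ can be written as

$$\left(\frac{d\mathbf{G}}{dt}\right)_{rot} = \sum_{i=1}^{3} \left(\frac{dG_{i}}{dt}\right)_{rot}\hat{\mathbf{e}}_{i}^{rot} \tag{12.13}$$

The inertial-frame time derivative taken with components along the rotating coordinate basis $\hat{\mathbf{e}}_{i}^{rot}$, equation
12.11, is

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \sum_{i=1}^{3} \left(\frac{dG_{i}}{dt}\right)_{rot}\hat{\mathbf{e}}_{i}^{rot} + \sum_{i=1}^{3} \left(G_{i}\right)_{rot}\left(\frac{d\hat{\mathbf{e}}_{i}^{rot}}{dt}\right)_{fix} \tag{12.14}$$

Substituting the unit vector $\hat{\mathbf{e}}_{i}^{rot}$ for $\mathbf{r}'_{mov}$ in equation 12.9, and using equation 12.12, gives

$$\left(\frac{d\hat{\mathbf{e}}_{i}^{rot}}{dt}\right)_{fix} = \boldsymbol{\omega} \times \hat{\mathbf{e}}_{i}^{rot} \tag{12.15}$$

Substituting this into the second term of equation 12.14 gives

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \left(\frac{d\mathbf{G}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{G} \tag{12.16}$$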

This important identity relates the time derivatives of any vector expressed in both the inertial frame and 
the rotating non-inertial frame bases. Note that the w x G term originates from the fact that the unit 
basis vectors of the rotating reference frame are time dependent with respect to the non-rotating frame basis 
vectors as given by equation (12.15). Equation (12.16) is used extensively for problems involving rotating 
frames. For example, for the special case where G = r’, then equation (12.16) relates the velocity vectors in 
the fixed and rotating frames as given in equation (12.9). 
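
Equation (12.16) lends itself to a direct numerical check. The sketch below is only an illustration under stated assumptions: the frame rotates about the z axis at a constant rate `OMEGA`, and the vector `G_rot` is a hypothetical test vector specified by its components along the rotating basis. None of these names or values come from the text.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def to_fixed(components_rot, angle):
    """Re-express rotating-basis components in the fixed basis (rotation about z)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = components_rot
    return (c * x - s * y, s * x + c * y, z)

OMEGA = 0.8                      # assumed constant angular speed about z, rad/s
omega_vec = (0.0, 0.0, OMEGA)    # same components in both bases for rotation about z

def G_rot(t):
    """Hypothetical test vector, given by its components along the rotating basis."""
    return (1.0 + t, math.sin(2.0 * t), 0.5 * t * t)

def dG_rot(t):
    """Time derivative of those components, i.e. (dG/dt)_rot."""
    return (1.0, 2.0 * math.cos(2.0 * t), t)

def G_fix(t):
    """The same vector expressed along the fixed basis."""
    return to_fixed(G_rot(t), OMEGA * t)

def lhs_fixed_derivative(t, h=1e-5):
    """(dG/dt)_fix from a central finite difference of the fixed-basis components."""
    gp, gm = G_fix(t + h), G_fix(t - h)
    return tuple((p - q) / (2.0 * h) for p, q in zip(gp, gm))

def rhs_transport(t):
    """(dG/dt)_rot + omega x G in the rotating basis, re-expressed in the fixed basis."""
    w_cross_G = cross(omega_vec, G_rot(t))
    total_rot = tuple(d + w for d, w in zip(dG_rot(t), w_cross_G))
    return to_fixed(total_rot, OMEGA * t)

t0 = 1.3
lhs = lhs_fixed_derivative(t0)
rhs = rhs_transport(t0)
print("finite difference  :", lhs)
print("(dG/dt)_rot + w x G:", rhs)
```

Both sides are compared in the fixed basis so that the finite difference on the left never uses the identity being tested.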

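Another example is the angular velocity vector $\boldsymbol{\omega}$, for which

$$\dot{\boldsymbol{\omega}} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} \tag{12.17}$$

That is, the angular acceleration $\dot{\boldsymbol{\omega}}$ has the same value in both the fixed and rotating frames of reference.


12.4 Reference frame undergoing rotation plus translation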


Consider the case where the system is accelerating in translation as well as rotating, that is, the primed
frame is the non-rotating translating frame. The position vector $\mathbf{r}_{fix}$ is taken with respect to the inertial
fixed unprimed frame which can be written in terms of the fixed unit basis vectors $(\hat{\mathbf{i}}_{fix}, \hat{\mathbf{j}}_{fix}, \hat{\mathbf{k}}_{fix})$. This $\mathbf{r}_{fix}$
vector can be written as the vector sum of the translational motion $\mathbf{R}_{fix}$ of the origin of the rotating system
with respect to the fixed frame, plus the position $\mathbf{r}'_{mov}$ with respect to this translating primed frame basis

$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \tag{12.18}$$

The time differential is

$$\left(\frac{d\mathbf{r}}{dt}\right)_{fix} = \left(\frac{d\mathbf{R}}{dt}\right)_{fix} + \left(\frac{d\mathbf{r}'}{dt}\right)_{mov} \tag{12.19}$$

The vector $\mathbf{r}'_{mov}$ is the position with respect to the translating frame of reference which can be expressed in
terms of the unit vectors $(\hat{\mathbf{i}}_{mov}, \hat{\mathbf{j}}_{mov}, \hat{\mathbf{k}}_{mov})$.

Equation 12.19 takes into account the translational motion of the moving primed frame basis. Now,
assuming that the double-primed frame rotates about the origin of the moving primed frame, then the net
displacement with respect to the original inertial frame basis can be combined with equation 12.9 leading to
the relation

$$\left(\frac{d\mathbf{r}}{dt}\right)_{fix} = \left(\frac{d\mathbf{R}}{dt}\right)_{fix} + \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.20}$$

Here the double-primed frame is both rotating and translating. Vectors in this frame are expressed in terms
of the unit basis vectors $(\hat{\mathbf{i}}_{rot}, \hat{\mathbf{j}}_{rot}, \hat{\mathbf{k}}_{rot})$.

Expressed as velocities, equation 12.20 can be written as

$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.21}$$

where:

$\mathbf{v}_{fix}$ is the velocity measured with respect to the inertial (unprimed) frame basis.

$\mathbf{V}_{fix}$ is the velocity of the origin of the non-inertial translating (primed) frame basis with respect to the
origin of the inertial (unprimed) frame basis.

$\mathbf{v}''_{rot}$ is the velocity of the particle with respect to the non-inertial rotating (double-primed) frame basis
the origin of which is both translating and rotating.

$\boldsymbol{\omega} \times \mathbf{r}'_{mov}$ is the motion of the rotating (double-primed) frame with respect to the linearly-translating
(primed) frame basis.

Thus this relation takes into account both the translational velocity plus rotation of the reference
coordinate frame basis vectors.


12.5 Newton’s law of motion in a non-inertial frame 


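The acceleration of the system in the rotating non-inertial frame can be derived by differentiating the general
velocity relation for $\mathbf{v}_{fix}$, equation 12.21, in the fixed frame basis which gives

$$\mathbf{a}_{fix} = \left(\frac{d\mathbf{v}_{fix}}{dt}\right)_{fix} = \left(\frac{d\mathbf{V}_{fix}}{dt}\right)_{fix} + \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fix} + \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} \times \mathbf{r}'_{mov} + \boldsymbol{\omega} \times \left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fix} \tag{12.22}$$

Now we wish to use the general transformation to a rotating frame basis which requires inclusion of the time
dependence of the unit vectors in the rotating frame, that is,

$$\left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fix} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{v}''_{rot} \tag{12.23}$$

$$\left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} \times \mathbf{r}'_{mov} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} \times \mathbf{r}'_{mov} \tag{12.24}$$

$$\boldsymbol{\omega} \times \left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fix} = \boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) \tag{12.25}$$

Using equations 12.23, 12.24, 12.25 gives

$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov} \tag{12.26}$$

where the acceleration in the rotating frame is $\mathbf{a}''_{rot} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot}$ while the velocity is $\mathbf{v}''_{rot} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rot}$ and
$\mathbf{A}_{fix}$ is with respect to the fixed frame.

Newton’s laws of motion are obeyed in the inertial frame, that is

$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} = m\left(\mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right) \tag{12.27}$$

In the double-primed frame, which may be both rotating and accelerating in translation, one can ascribe an
effective force $\mathbf{F}^{eff}_{rot}$ that obeys an effective Newton’s law for the acceleration $\mathbf{a}''_{rot}$ in the rotating frame

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left(\mathbf{A}_{fix} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right) \tag{12.28}$$

Note that the effective force $\mathbf{F}^{eff}_{rot}$ comprises the physical force $\mathbf{F}_{fix}$ minus four non-inertial forces that are
introduced to correct for the fact that the rotating reference frame is a non-inertial frame.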


12.6 Lagrangian mechanics in a non-inertial frame 


The above derivation of the equations of motion in the rotating frame is based on Newtonian mechanics. 
Lagrangian mechanics provides another derivation of these equations of motion for a rotating frame of 
reference by exploiting the fact that the Lagrangian is a scalar which is frame independent, that is, it is 
invariant to rotation of the frame of reference. 

The Lagrangian in any frame is given by

$$L = \frac{1}{2}m\,\mathbf{v}\cdot\mathbf{v} - U(r) \tag{12.29}$$


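The scalar product $\mathbf{v}\cdot\mathbf{v}$ is the same in any rotated frame and can be evaluated in terms of the rotating
frame variables using the same decomposition of the translational plus rotational motion as used previously
and given in equation 12.21.

Equation (12.21) decomposes the velocity in the fixed inertial frame $\mathbf{v}_{fix}$ into three vector terms, the
translational velocity $\mathbf{V}_{fix}$ of the translating frame, the velocity in the rotating-translating frame $\mathbf{v}''_{rot}$, and
the rotational velocity $(\boldsymbol{\omega} \times \mathbf{r}'_{mov})$. Using equations 12.29 and 12.21, plus appendix equation B.21 for the triple
products, gives that the Lagrangian evaluated using $\mathbf{v}_{fix}\cdot\mathbf{v}_{fix}$ equals

$$L = \frac{1}{2}m\left[\mathbf{V}_{fix}\cdot\mathbf{V}_{fix} + \mathbf{v}''_{rot}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot\mathbf{v}''_{rot} + 2\mathbf{V}_{fix}\cdot\left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + 2\mathbf{v}''_{rot}\cdot\left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right)\cdot\left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right)\right] - U(r) \tag{12.30}$$

This can be used to derive the canonical momentum in the rotating frame

$$\mathbf{p}''_{rot} = \frac{\partial L}{\partial \mathbf{v}''_{rot}} = m\left[\mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov}\right] \tag{12.31}$$

The Lagrange equations can be used to derive the equations of motion in terms of the variables evaluated
in the rotating reference frame. The required Lagrange derivatives are

$$\frac{d}{dt}\frac{\partial L}{\partial \mathbf{v}''_{rot}} = m\left[\mathbf{A}_{fix} + \mathbf{a}''_{rot} + \left(\boldsymbol{\omega} \times \mathbf{v}''_{rot}\right) + \left(\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right)\right]_{rot} \tag{12.32}$$

and

$$\frac{\partial L}{\partial \mathbf{r}''_{rot}} = m\left[-\left(\boldsymbol{\omega} \times \mathbf{V}_{fix}\right) - \left(\boldsymbol{\omega} \times \mathbf{v}''_{rot}\right) - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right)\right]_{rot} - \nabla U \tag{12.33}$$

where the scalar triple product, equation B.21, has been used. Thus the Lagrange equations give for the
rotating frame basis that

$$m\mathbf{a}''_{rot} = -\nabla U - m\left[\mathbf{A}_{fix} + \left(\boldsymbol{\omega} \times \mathbf{V}_{fix}\right) + 2\left(\boldsymbol{\omega} \times \mathbf{v}''_{rot}\right) + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \left(\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right)\right]_{rot} \tag{12.34}$$

The external force is identified as $\mathbf{F}_{fix} = -\nabla U$. Equation 12.16 can be used to transform between the
fixed and the rotating bases.

$$\mathbf{A}_{fix} = \left[\mathbf{A}_{fix} + \left(\boldsymbol{\omega} \times \mathbf{V}_{fix}\right)\right]_{rot} \tag{12.35}$$

This leads to an effective force in the non-inertial translating plus rotating frame that corresponds to an
effective Newtonian force of

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left[\mathbf{A}_{fix} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) + \left(\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right)\right] \tag{12.36}$$

where $\mathbf{A}_{fix}$ is expressed in the fixed frame. The derivation of equation 12.36 using Lagrangian mechanics
confirms the identical formula 12.28 derived using Newtonian mechanics.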

The four correction terms for the non-inertial frame basis correspond to the following effective forces.

Translational acceleration: $\mathbf{F}^{eff}_{tran} = -m\mathbf{A}_{fix}$ is the usual inertial force experienced in a linearly
accelerating frame of reference, where $\mathbf{A}_{fix}$ is with respect to the fixed frame.

Coriolis force: $\mathbf{F}^{eff}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$. This is a new type of inertial force that is present only when a
particle is moving in the rotating frame. This force is proportional to the velocity in the rotating frame and
is independent of the position in the rotating frame.

Centrifugal force: $\mathbf{F}^{eff}_{cf} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov})$. This is due to the centripetal acceleration of the particle
owing to the rotation of the moving axis about the axis of rotation.

Transverse (azimuthal) force: $\mathbf{F}^{eff}_{az} = -m\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}$. This is a straightforward term due to acceleration of
the particle due to the angular acceleration of the rotating axes.

The above inertial forces are correction terms arising from trying to extend Newton’s laws of motion to
a non-inertial frame involving both translation and rotation. These correction forces are often referred to as
“fictitious” forces. However, these non-inertial forces appear very real to an observer located in the non-inertial frame.
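
The four inertial-force terms listed above can be collected into a small helper. This is a minimal sketch, not a definitive implementation: the function name `inertial_forces`, the dictionary keys, and the sample numbers are all assumptions chosen for illustration.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(c, a):
    return tuple(c * x for x in a)

def inertial_forces(m, A_fix, omega, omega_dot, r_mov, v_rot):
    """Return the four inertial-force terms for a particle of mass m.

    A_fix     : acceleration of the frame origin
    omega     : angular velocity of the rotating frame
    omega_dot : angular acceleration of the rotating frame
    r_mov     : position relative to the frame origin
    v_rot     : velocity measured in the rotating frame
    """
    return {
        "translational": scale(-m, A_fix),
        "coriolis": scale(-2.0 * m, cross(omega, v_rot)),
        "centrifugal": scale(-m, cross(omega, cross(omega, r_mov))),
        "transverse": scale(-m, cross(omega_dot, r_mov)),
    }

# Hypothetical sample values: frame spinning up about z, particle on the x axis
# moving in the +y direction as seen in the rotating frame.
forces = inertial_forces(
    m=1.0,
    A_fix=(0.0, 0.0, 0.0),
    omega=(0.0, 0.0, 2.0),
    omega_dot=(0.0, 0.0, 0.5),
    r_mov=(3.0, 0.0, 0.0),
    v_rot=(0.0, 1.0, 0.0),
)
for name, vec in forces.items():
    print(f"{name:>13}: {vec}")
```

For these sample values the centrifugal term points radially outward along +x with magnitude $m\omega^{2}r$, and the Coriolis term points to the right of the velocity when viewed from above a counterclockwise-rotating frame.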


Since the centrifugal and Coriolis terms are unusual they are discussed below. 


12.7 Centrifugal force 


The centrifugal force was defined as

$$\mathbf{F}_{cf} = -m\boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) \tag{12.37}$$

Note that

$$\boldsymbol{\omega} \cdot \mathbf{F}_{cf} = 0 \tag{12.38}$$

therefore the centrifugal force is perpendicular to the axis of rotation.

Using the vector identity, equation B.24, allows the centrifugal force to be written as

$$\mathbf{F}_{cf} = -m\left[\left(\boldsymbol{\omega} \cdot \mathbf{r}'_{mov}\right)\boldsymbol{\omega} - \omega^{2}\mathbf{r}'_{mov}\right] \tag{12.39}$$

For the case where the radius $\mathbf{r}'$ is perpendicular to $\boldsymbol{\omega}$ then $\boldsymbol{\omega} \cdot \mathbf{r}' = 0$ and thus for this special case

$$\mathbf{F}_{cf} = m\omega^{2}\mathbf{r}'_{mov} \tag{12.40}$$

Figure 12.3: Centrifugal force.

The centrifugal force is experienced when riding in a car driven rapidly around a bend. The passenger
experiences an apparent centrifugal (center fleeing) force that thrusts them to the outside of the bend relative
to the inside of the turning car. In reality, relative to the fixed inertial frame, i.e. the road, the friction
between the car tires and the road is changing the direction of the car towards the inside of the bend and
the car seat is causing the centripetal (center seeking) acceleration of the passenger. A bucket of water
attached to a rope can be swung around in a vertical plane without spilling any water if the centrifugal force
exceeds the gravitational force at the top of the trajectory.
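
The bucket condition can be made quantitative: at the top of the circle the centrifugal force $m\omega^{2}r$ must be at least $mg$, so $\omega \geq \sqrt{g/r}$. The short sketch below evaluates this threshold; the radius of 1 m and the function name are illustrative assumptions.

```python
import math

def min_angular_speed(radius, g=9.81):
    """Smallest angular speed (rad/s) for which m*omega^2*r >= m*g at the top."""
    return math.sqrt(g / radius)

radius = 1.0   # assumed distance from the pivot to the water, metres
omega_min = min_angular_speed(radius)
print(f"omega_min    = {omega_min:.3f} rad/s")
print(f"period       = {2.0 * math.pi / omega_min:.3f} s")
print(f"speed at top = {omega_min * radius:.3f} m/s")
```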


12.8 Coriolis force


The Coriolis force was defined to be

$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot} \tag{12.41}$$

where $\mathbf{v}''_{rot}$ is the velocity measured in the rotating (double-primed) frame. The Coriolis force is an
interesting force; it is perpendicular to both the axis of rotation and the velocity vector in the rotating
frame, that is, it is analogous to the $q\mathbf{v} \times \mathbf{B}$ Lorentz magnetic force.

The understanding of the Coriolis effect is facilitated by considering the physics of a hockey puck sliding
on a rotating frictionless table. Assume that the table rotates with constant angular frequency $\boldsymbol{\omega} = \omega\hat{\mathbf{k}}$ about
the z axis. For this system the origin of the rotating system is fixed, and the angular frequency is constant,
thus $\mathbf{A}$ and $\dot{\boldsymbol{\omega}} \times \mathbf{r}'$ are zero. Also it is assumed that there are no external forces acting on the hockey puck,
thus the net acceleration of the puck sliding on the table, as seen in the rotating frame, simplifies to

$$\mathbf{a}''_{rot} = -2\boldsymbol{\omega} \times \mathbf{v}''_{rot} - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'_{mov}\right) = -2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot} + \omega^{2}\mathbf{r}'_{mov} \tag{12.42}$$

The centrifugal acceleration $+\omega^{2}\mathbf{r}'_{mov}$ is radially outwards while the Coriolis acceleration $-2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot}$ is to
the right of the direction of motion. Integration of the equations of motion can be used to calculate the
trajectories in the rotating frame of reference.

Figure 12.4: Free-force motion of a hockey puck sliding on a rotating frictionless table of radius R that is rotating with
constant angular frequency $\omega$ out of the page.

Figure 12.4 illustrates trajectories of the hockey puck in the rotating reference frame when no external
forces are acting, that is, in the inertial frame the puck moves in a straight line with constant velocity $\mathbf{v}_0$.
In the rotating reference frame the Coriolis force accelerates the puck to the right leading to trajectories
that exhibit spiral motion. The apparently complicated trajectories are a result of the observer being in the
rotating frame, for which the straight inertial-frame trajectories of the moving puck exhibit a spiralling
trajectory in the rotating frame.
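
The integration mentioned above can be sketched in a few lines. The example below is a hedged illustration only: it integrates the rotating-frame acceleration of equation 12.42 with a hand-written fourth-order Runge-Kutta step and compares the result with the straight-line inertial trajectory rotated into the table frame. The value of `OMEGA`, the initial conditions, and all function names are assumptions.

```python
import math

OMEGA = 1.0  # assumed table angular frequency (rad/s), counterclockwise about z

def acceleration(x, y, vx, vy, w=OMEGA):
    """Coriolis plus centrifugal acceleration in components: -2w k x v + w^2 r."""
    ax = 2.0 * w * vy + w * w * x
    ay = -2.0 * w * vx + w * w * y
    return ax, ay

def deriv(state):
    x, y, vx, vy = state
    ax, ay = acceleration(x, y, vx, vy)
    return (vx, vy, ax, ay)

def rk4_step(state, dt):
    """Advance (x, y, vx, vy) by one fourth-order Runge-Kutta step."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, 0.5 * dt))
    k3 = deriv(shifted(k2, 0.5 * dt))
    k4 = deriv(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def rotating_from_inertial(x_fix, y_fix, t, w=OMEGA):
    """Rotate fixed-frame coordinates into the frame that has turned through w*t."""
    c, s = math.cos(w * t), math.sin(w * t)
    return (c * x_fix + s * y_fix, -s * x_fix + c * y_fix)

# Hypothetical launch: inertial-frame position and velocity at t = 0.
x0, y0 = 0.2, 0.0
v0x, v0y = 0.5, 0.3

# Rotating-frame velocity at t = 0 is v_fix - omega x r.
state = (x0, y0, v0x + OMEGA * y0, v0y - OMEGA * x0)
dt, steps = 1e-3, 3000
for _ in range(steps):
    state = rk4_step(state, dt)

t_end = dt * steps
expected = rotating_from_inertial(x0 + v0x * t_end, y0 + v0y * t_end, t_end)
print("integrated in rotating frame :", state[:2])
print("rotated inertial straight line:", expected)
```

Agreement between the two printed positions illustrates that the spiral seen on the table is the same force-free straight line, viewed from the rotating frame.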

The Coriolis force is the reason that winds circulate in an anticlockwise direction about low-pressure 
regions in the Earth’s northern hemisphere. It also has important consequences in many activities on earth 
such as ballet dancing, ice skating, acrobatics, nuclear and molecular rotation, and the motion of missiles. 


12.1 Example: Accelerating spring plane pendulum 


Comparison of the relative merits of using a non-inertial frame versus an inertial frame is given by a
spring pendulum attached to an accelerating fulcrum. As shown in the figure, the spring pendulum comprises
a mass m attached to a massless spring that has a rest length $r_0$ and spring constant k. The system is
in a vertical gravitational field g and the fulcrum of the pendulum is accelerating vertically upwards with a
constant acceleration a. Assume that the spring pendulum oscillates only in the vertical $\theta$ plane.

Inertial frame:

This problem can be solved in the fixed inertial coordinate system with coordinates (x, y). These coordinates,
and their time derivatives, are given in terms of r and $\theta$ by

$$x = r\sin\theta \qquad \dot{x} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta$$

$$y = -r\cos\theta + \frac{1}{2}at^{2} \qquad \dot{y} = r\dot{\theta}\sin\theta - \dot{r}\cos\theta + at$$

Thus

$$L = \frac{1}{2}m\left(\dot{x}^{2} + \dot{y}^{2}\right) - mgy - \frac{1}{2}k\left(r - r_0\right)^{2}$$

$$\;\; = \frac{1}{2}m\left[\dot{r}^{2} + r^{2}\dot{\theta}^{2} + a^{2}t^{2} + 2at\left(r\dot{\theta}\sin\theta - \dot{r}\cos\theta\right)\right] + mg\left(r\cos\theta - \frac{1}{2}at^{2}\right) - \frac{1}{2}k\left(r - r_0\right)^{2}$$
The Lagrange equations of motion are given by

$\Lambda_{r}L = 0$:

$$\ddot{r} - r\dot{\theta}^{2} - (a + g)\cos\theta + \frac{k}{m}\left(r - r_0\right) = 0$$

$\Lambda_{\theta}L = 0$:

$$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$

The generalized momenta are

$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} - mat\cos\theta$$

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^{2}\dot{\theta} + matr\sin\theta$$

These lead to the corresponding velocities of

$$\dot{r} = \frac{p_r}{m} + at\cos\theta$$

$$\dot{\theta} = \frac{p_\theta}{mr^{2}} - \frac{at\sin\theta}{r}$$

and thus the Hamiltonian is given by

$$H = p_r\dot{r} + p_\theta\dot{\theta} - L = \frac{p_r^{2}}{2m} + \frac{p_\theta^{2}}{2mr^{2}} - \frac{at}{r}p_\theta\sin\theta + atp_r\cos\theta + \frac{1}{2}k\left(r - r_0\right)^{2} + \frac{1}{2}mgat^{2} - mgr\cos\theta$$

The Hamilton equations of motion give that

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} + at\cos\theta$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^{2}} - \frac{at\sin\theta}{r}$$

These radial and angular velocities are the same as obtained using Lagrangian mechanics.
The Hamilton equations for $\dot{p}_r$ and $\dot{p}_\theta$ are given by

$$\dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{at}{r^{2}}p_\theta\sin\theta - k\left(r - r_0\right) + mg\cos\theta + \frac{p_\theta^{2}}{mr^{3}}$$

Similarly

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{at}{r}p_\theta\cos\theta + atp_r\sin\theta - mgr\sin\theta$$

The transformation equations relating the generalized coordinates r, $\theta$ are time dependent so the
Hamiltonian H does not equal the total energy E. In addition neither the Lagrangian nor the Hamiltonian are
conserved since they both are time dependent. The fact that the Hamiltonian is not conserved is obvious since
the whole system is accelerating upwards leading to increasing kinetic and potential energies. Moreover, the
time derivative of the angular momentum $\dot{p}_\theta$ is non-zero so the angular momentum $p_\theta$ is not conserved.

Non-inertial fulcrum frame:

This system also can be addressed in the accelerating non-inertial fulcrum frame of reference which is
fixed to the fulcrum of the spring of the pendulum. In this non-inertial frame of reference, the acceleration
of the frame can be taken into account using an effective acceleration a which is added to the gravitational
acceleration; that is, g is replaced by an effective gravitational acceleration (g + a). Then the Lagrangian in the
fulcrum frame simplifies to

$$L_{fulcrum} = \frac{1}{2}m\left(\dot{r}^{2} + r^{2}\dot{\theta}^{2}\right) + m(g + a)\left(r\cos\theta\right) - \frac{1}{2}k\left(r - r_0\right)^{2}$$

The Lagrange equations of motion in the fulcrum frame are given by

$\Lambda_{r}L_{fulcrum} = 0$:

$$\ddot{r} - r\dot{\theta}^{2} - (a + g)\cos\theta + \frac{k}{m}\left(r - r_0\right) = 0$$

$\Lambda_{\theta}L_{fulcrum} = 0$:

$$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$

These are identical to the Lagrange equations of motion derived in the inertial frame.
The $L_{fulcrum}$ can be used to derive the momenta in the non-inertial fulcrum frame

$$\tilde{p}_r = \frac{\partial L_{fulcrum}}{\partial \dot{r}} = m\dot{r}$$

$$\tilde{p}_\theta = \frac{\partial L_{fulcrum}}{\partial \dot{\theta}} = mr^{2}\dot{\theta}$$

which comprise only a part of the momenta derived in the inertial frame. These partial fulcrum momenta
lead to a Hamiltonian for the fulcrum frame of

$$H_{fulcrum} = \tilde{p}_r\dot{r} + \tilde{p}_\theta\dot{\theta} - L_{fulcrum} = \frac{\tilde{p}_r^{2}}{2m} + \frac{\tilde{p}_\theta^{2}}{2mr^{2}} + \frac{1}{2}k\left(r - r_0\right)^{2} - m(g + a)r\cos\theta$$

Both $L_{fulcrum}$ and $H_{fulcrum}$ are time independent and thus the fulcrum Hamiltonian $H_{fulcrum}$ is a constant
of motion in the fulcrum frame. However, $H_{fulcrum}$ does not equal the total energy which is increasing with
time due to the acceleration of the fulcrum frame relative to the inertial frame. This example illustrates that
use of non-inertial frames can simplify solution of accelerating systems.
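
The conservation of the fulcrum-frame Hamiltonian can be illustrated numerically. The following is a minimal sketch, assuming hypothetical parameter values and a hand-written fourth-order Runge-Kutta integrator; it integrates the two fulcrum-frame equations of motion and evaluates $H_{fulcrum}$ at the start and end of the run.

```python
import math

# Assumed, illustrative parameters (not from the text).
M, K, R0 = 1.0, 40.0, 1.0      # mass (kg), spring constant (N/m), rest length (m)
G, A = 9.81, 2.0               # gravity and upward fulcrum acceleration (m/s^2)

def derivs(state):
    """Fulcrum-frame equations of motion for (r, theta, r_dot, theta_dot)."""
    r, theta, r_dot, theta_dot = state
    g_eff = G + A
    r_ddot = r * theta_dot ** 2 + g_eff * math.cos(theta) - (K / M) * (r - R0)
    theta_ddot = -2.0 * r_dot * theta_dot / r - g_eff * math.sin(theta) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt):
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(shifted(k1, 0.5 * dt))
    k3 = derivs(shifted(k2, 0.5 * dt))
    k4 = derivs(shifted(k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def h_fulcrum(state):
    """Fulcrum-frame Hamiltonian written in terms of the velocities."""
    r, theta, r_dot, theta_dot = state
    kinetic = 0.5 * M * (r_dot ** 2 + (r * theta_dot) ** 2)
    spring = 0.5 * K * (r - R0) ** 2
    gravity = -M * (G + A) * r * math.cos(theta)
    return kinetic + spring + gravity

state = (1.2, 0.4, 0.0, 0.0)   # released from rest, stretched and displaced
h_start = h_fulcrum(state)
dt = 1e-3
for _ in range(5000):
    state = rk4_step(state, dt)
h_end = h_fulcrum(state)
print("H_fulcrum at start:", h_start)
print("H_fulcrum at end  :", h_end)
```

The two values should agree to within the integrator's truncation error, consistent with $H_{fulcrum}$ being a constant of motion in the fulcrum frame.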


12.2 Example: Surface of rotating liquid 


Find the shape of the surface of liquid in a bucket that rotates with angular speed $\omega$ as shown in the
adjacent figure. Assume that the liquid is at rest in the frame of the bucket. Therefore, in the coordinate
system rotating with the bucket of liquid, the centrifugal force is important whereas the Coriolis,
translational, and transverse forces are zero. The external force is

$$\mathbf{F} = \mathbf{F}' + m\mathbf{g}$$

where $\mathbf{F}'$ is the pressure force which is perpendicular to the surface. At equilibrium the acceleration of the
surface is zero, that is

$$m\mathbf{a}'' = 0 = \mathbf{F}' + m\left(\mathbf{g} - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'\right)\right)$$

The effective gravitational force is

$$\mathbf{g}_{eff} = \left(\mathbf{g} - \boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'\right)\right)$$

which must be perpendicular to the surface of the liquid since $\mathbf{F}'$ is perpendicular to the surface of a fluid,
and the net force is zero. In cylindrical coordinates this can be written as

$$\mathbf{g}_{eff} = -g\hat{\mathbf{z}} + \rho\omega^{2}\hat{\boldsymbol{\rho}}$$

From the figure it can be deduced that the slope of the surface is

$$\frac{dz}{d\rho} = \frac{\omega^{2}\rho}{g}$$

By integration

$$z = \frac{\omega^{2}}{2g}\rho^{2} + \text{constant}$$

This is the equation of a paraboloid and corresponds to a parabolic gravitational equipotential energy surface.
Astrophysicists build large parabolic mirrors for telescopes by continuously spinning a large vat of glass while
it solidifies. This is much easier than grinding a large cylindrical block of glass into a parabolic shape.
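
As a small worked illustration of the paraboloid result, the sketch below evaluates the surface height and the focal length of the resulting mirror. Comparing $z = \omega^{2}\rho^{2}/2g$ with the standard parabola $z = \rho^{2}/4f$ gives $f = g/2\omega^{2}$. The spin rate of 7 revolutions per minute and the function names are assumptions for illustration.

```python
import math

def surface_height(rho, omega, g=9.81, z0=0.0):
    """Height of the liquid surface at radius rho: z = omega^2 rho^2 / (2 g) + z0."""
    return omega ** 2 * rho ** 2 / (2.0 * g) + z0

def focal_length(omega, g=9.81):
    """Focal length of the paraboloid z = rho^2 / (4 f), i.e. f = g / (2 omega^2)."""
    return g / (2.0 * omega ** 2)

rpm = 7.0                                  # assumed spin rate of the vat
omega = rpm * 2.0 * math.pi / 60.0         # rad/s
rho = 0.5                                  # radius at which to sample, metres

print(f"omega           = {omega:.4f} rad/s")
print(f"height at rho   = {surface_height(rho, omega):.5f} m")
print(f"focal length    = {focal_length(omega):.3f} m")
```

Because the focal length depends only on g and the spin rate, slower rotation gives a shallower surface with a longer focal length.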


12.3 Example: The pirouette 


An interesting application of the Coriolis force is the problem of a spinning ice skater or ballet dancer.
Her angular frequency increases when she draws in her arms. The conventional explanation is that angular
momentum is conserved in the absence of any external torques, which is correct. Thus since her moment of
inertia decreases when she retracts her arms, her angular velocity must increase to maintain a constant
angular momentum $L = I\omega$. But this explanation does not address the question of which forces
cause the angular frequency to increase. The real radial forces the skater feels when she retracts her
arms cannot directly lead to angular acceleration since radial forces are perpendicular to the rotation. The
following derivation shows that the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$ acts tangentially to the radial retraction
velocity of her arms leading to the angular acceleration required to maintain constant angular momentum.

Consider that a mass m is moving radially at a velocity $\dot{\mathbf{r}}''_{rot}$; then the Coriolis force in the rotating frame
is

$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \dot{\mathbf{r}}''_{rot}$$

This Coriolis force leads to an angular acceleration of the mass of

$$\dot{\omega} = -\frac{2\omega\,\dot{r}''_{rot}}{r''} \tag{a}$$

that is, the rotational frequency decreases if the radius is increased. Note that, as shown in equation 12.17,
$\dot{\boldsymbol{\omega}} = \dot{\boldsymbol{\omega}}''$. This nonzero value of $\dot{\omega}$ obviously leads to an azimuthal force in addition to the Coriolis force.
Consider the rate of change of angular momentum for the rotating mass m assuming that the angular
momentum comes purely from the rotation $\omega$. Then in the rotating frame

$$\dot{p}_\theta = \frac{d}{dt}\left(mr''^{2}\omega\right) = 2mr''\dot{r}''\omega + mr''^{2}\dot{\omega}$$

Substituting equation a for $\dot{\omega}$ in the second term gives

$$\dot{p}_\theta = 2mr''\dot{r}''\omega - 2mr''\dot{r}''\omega = 0$$

That is, the two terms cancel. Thus the angular momentum is conserved for this case where the velocity is
radial. Note that, since $\mathbf{p}_\theta$ is assumed to be colinear with $\boldsymbol{\omega}$, then it is the same in both the stationary and
rotating frames of reference and thus angular momentum is conserved in both frames. In addition, in the
fixed frame, the angular momentum is conserved if no external torques are acting as assumed above.

Note that the rotational energy is

$$E_{rot} = \frac{1}{2}I\omega^{2}$$

Also the angular momentum is conserved, that is

$$p_\theta = I\omega = l$$

Substituting $\omega = \frac{l}{I}$ in the rotational energy gives

$$E_{rot} = \frac{p_\theta^{2}}{2I} = \frac{l^{2}}{2I}$$

Therefore the rotational energy actually increases as the moment of inertia decreases when the ice skater
pulls her arms close to her body. This increase in rotational energy is provided by the work done as the
dancer pulls her arms inward against the centrifugal force.
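
The statement that the gain in rotational energy equals the work done against the centrifugal force can be checked with a simple point-mass model. The sketch below is illustrative only: it assumes the arms are a single point mass with $I = mr^{2}$, holds the angular momentum l fixed, and integrates the inward pull $m\omega^{2}r$ over the radial path with a midpoint rule. All names and numbers are assumptions.

```python
import math

def omega_at(r, l, m):
    """Angular velocity for fixed angular momentum l, with I = m r^2."""
    return l / (m * r * r)

def rotational_energy(r, l, m):
    """E_rot = l^2 / (2 I) with I = m r^2."""
    return l * l / (2.0 * m * r * r)

def work_against_centrifugal(r_start, r_end, l, m, n=20000):
    """Midpoint-rule integral of the applied inward force over the radial path."""
    dr = (r_end - r_start) / n          # negative when the mass is pulled inward
    work = 0.0
    for i in range(n):
        r = r_start + (i + 0.5) * dr
        centrifugal = m * omega_at(r, l, m) ** 2 * r   # outward, magnitude m w^2 r
        work += -centrifugal * dr                      # applied force is inward
    return work

m = 4.0                        # assumed lumped mass of the arms, kg
r_start, r_end = 0.8, 0.3      # assumed radii before and after retraction, m
l = m * r_start ** 2 * (2.0 * math.pi)   # angular momentum at 1 revolution per second

e_start = rotational_energy(r_start, l, m)
e_end = rotational_energy(r_end, l, m)
work = work_against_centrifugal(r_start, r_end, l, m)
print("energy gain:", e_end - e_start)
print("work done  :", work)
```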


12.9 Routhian reduction for rotating systems


The Routhian reduction technique, which was introduced in chapter 8.6, is a hybrid variational approach. It
was devised by Routh to handle the cyclic and non-cyclic variables separately in order to simultaneously
exploit the differing advantages of the Hamiltonian and Lagrangian formulations. The Routhian reduction
technique is a powerful method for handling rotating systems ranging from galaxies to molecules, or deformed
nuclei, as well as rotating machinery in engineering. A valuable feature of the Hamiltonian formulation is
that it allows elimination of cyclic variables which reduces the number of degrees of freedom to be handled.
As a consequence, cyclic variables are called ignorable variables in Hamiltonian mechanics. The Lagrangian,
the Hamiltonian and the Routhian all are scalars under rotation and thus are invariant to rotation of the
frame of reference. Note that often there are only two cyclic variables for a rotating system, that is, $\dot{\theta} = \omega$
and the corresponding canonical total angular momentum $p_\theta = J$.

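As mentioned in chapter 8.6, there are two possible Routhians that are useful for handling rotating frames
of reference. For rotating systems the cyclic Routhian $R_{cyclic}$ simplifies to

$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) = H_{cyclic} - L_{noncyclic} = \boldsymbol{\omega} \cdot \mathbf{J} - L \tag{12.43}$$

This Routhian behaves like a Hamiltonian for the ignorable cyclic coordinates $\omega, J$. Simultaneously it behaves
like a negative Lagrangian $L_{noncyclic}$ for all the other coordinates.
The non-cyclic Routhian $R_{noncyclic}$ complements $R_{cyclic}$ in that it is defined as

$$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) = H_{noncyclic} - L_{cyclic} = H - \boldsymbol{\omega} \cdot \mathbf{J} \tag{12.44}$$

This non-cyclic Routhian behaves like a Hamiltonian for all the non-cyclic variables and behaves like a
negative Lagrangian for the two cyclic variables $\omega, J$. Since the cyclic variables are constants of motion,
then $R_{noncyclic}$ is a constant of motion that equals the energy in the rotating frame if H is a constant of
motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time
dependent, that is, the Routhian $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion.

For example, the Routhian $R_{noncyclic}$ for a system that is being cranked about the $\phi$ axis at some fixed
angular frequency $\dot{\phi} = \omega$, with corresponding total angular momentum $p_\phi = J$, can be written as¹

$$R_{noncyclic} = H - \boldsymbol{\omega} \cdot \mathbf{J} \tag{12.45}$$

$$\;\; = \frac{1}{2}m\left[\mathbf{V}\cdot\mathbf{V} + \mathbf{v}''\cdot\mathbf{v}'' + 2\mathbf{V}\cdot\mathbf{v}'' + 2\mathbf{V}\cdot\left(\boldsymbol{\omega} \times \mathbf{r}'\right) + 2\mathbf{v}''\cdot\left(\boldsymbol{\omega} \times \mathbf{r}'\right) + \left(\boldsymbol{\omega} \times \mathbf{r}'\right)^{2}\right] - \boldsymbol{\omega} \cdot \mathbf{J} + U(r)$$

Note that $R_{noncyclic}$ is a constant of motion if $\frac{\partial L}{\partial t} = 0$, which is the case when the system is being cranked
at a constant angular frequency. However the Hamiltonian in the rotating frame $H_{rot} = H - \boldsymbol{\omega} \cdot \mathbf{J}$ is given
by $R_{noncyclic} = H_{rot} \neq E$ since the coordinate transformation is time dependent. The canonical Hamilton
equations for the fourth and fifth terms in the bracket can be identified with the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''$,
while the last term in the bracket is identified with the centrifugal force. That is, define

$$U_{cf} \equiv -\frac{1}{2}m\left(\boldsymbol{\omega} \times \mathbf{r}'\right)^{2} \tag{12.46}$$

where the gradient of $U_{cf}$ gives the usual centrifugal force.

$$\mathbf{F}_{cf} = -\nabla U_{cf} = \frac{m}{2}\nabla\left[\omega^{2}r'^{2} - \left(\boldsymbol{\omega} \cdot \mathbf{r}'\right)^{2}\right] = m\left[\omega^{2}\mathbf{r}' - \left(\boldsymbol{\omega} \cdot \mathbf{r}'\right)\boldsymbol{\omega}\right] = -m\boldsymbol{\omega} \times \left(\boldsymbol{\omega} \times \mathbf{r}'\right) \tag{12.47}$$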


The Routhian reduction method is used extensively in science and engineering to describe rotational
motion of rigid bodies, molecules, deformed nuclei, and astrophysical objects. The cyclic variables describe
the rotation of the frame and thus the Routhian $R_{noncyclic} = H_{rot}$ corresponds to the Hamiltonian for the
non-cyclic variables in the rotating frame.


$^1$For clarity sections 10.1 to 10.8 of this chapter adopted a naming convention that uses unprimed coordinates with the
subscript fix for the inertial frame of reference, primed coordinates with the subscript mov for the translating coordinates, and
double-primed coordinates with the subscript rot for the translating plus rotating frame. For brevity the subsequent discussion
omits the redundant subscripts fix, mov, rot since the single and double prime superscripts completely define the moving and
rotating frames of reference.

12.4 Example: Cranked plane pendulum


The cranked plane pendulum, which is also called the rotating plane pendulum, comprises a plane pendulum
that is cranked around a vertical axis at a constant angular velocity $\dot{\phi} = \omega$ as determined by some
external drive mechanism. The parameters are illustrated in the adjacent figure. The cranked pendulum nicely
illustrates the advantages of working in a non-inertial rotating frame for a driven rotating system.
Although the cranked plane pendulum looks similar to the spherical pendulum, there is one very important
difference; for the spherical pendulum $p_\phi = ml^2\sin^2\theta\,\dot{\phi}$ is a constant of motion and thus the angular velocity
varies with $\theta$, i.e. $\dot{\phi} = \frac{p_\phi}{ml^2\sin^2\theta}$, whereas for the cranked plane pendulum, the constant of motion
is $\dot{\phi} = \omega$ and thus the angular momentum varies with $\theta$, i.e. $p_\phi = ml^2\sin^2\theta\,\omega$. For the cranked plane
pendulum, the energy must flow into and out of the cranking drive system that is providing the constraint
force to satisfy the equation of constraint

$$g_\phi = \dot{\phi} - \omega = 0$$

Figure: Cranked plane pendulum that is cranked around the vertical axis with angular velocity $\dot{\phi} = \omega$.

The easiest way to solve the equations of motion for the cranked plane pendulum is to use generalized
coordinates to absorb the equation of constraint and applied constraint torque. This is done by incorporating the
$\dot{\phi} = \omega$ constraint explicitly in the Lagrangian or Hamiltonian and solving for just $\theta$ in the rotating frame.
Assuming that $\dot{\phi} = \omega$, and using generalized coordinates to absorb the cranking constraint forces, then
the Lagrangian for the cranked pendulum can be written as

$$L = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \sin^2\theta\,\omega^2\right) + mgl\cos\theta$$

The momentum conjugate to $\theta$ is

$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = ml^2\dot{\theta}$$

Consider the Routhian $R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi}$ which acts as a Hamiltonian $H_{rot}$ in the rotating
frame

$$R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi} = \frac{p_\theta^2}{2ml^2} - \frac{1}{2}ml^2\omega^2\sin^2\theta - mgl\cos\theta$$

Note that if $\dot{\phi} = \omega$ is constant, then $R_{noncyclic}$ is a constant of motion for rotation about the $\phi$ axis since
it is independent of $\phi$. Also $\frac{dR_{noncyclic}}{dt} = -\frac{\partial L}{\partial t} = 0$ thus the energy in the rotating non-inertial frame of the
pendulum $R_{noncyclic} = H_{rot} = H - p_\phi\dot{\phi}$ is a constant of motion, but it does not equal the total energy since
the rotating coordinate transformation is time dependent. The driver that cranks the system at a constant $\omega$
provides or absorbs the energy $dW = dE = \omega\,dp_\phi$ as $\theta$ changes in order to maintain a constant $\omega$.

The Routhian $R_{noncyclic}$ can be used to derive the equations of motion using Hamiltonian mechanics.

$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{ml^2}$$

$$\dot{p}_\theta = -\frac{\partial R_{noncyclic}}{\partial\theta} = -mgl\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$

Since $p_\theta = ml^2\dot{\theta}$, then the equation of motion is

$$\ddot{\theta} + \frac{g}{l}\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] = 0 \tag{a}$$


Assuming that $\sin\theta \approx \theta$, then equation a leads to linear harmonic oscillator solutions about a minimum
at $\theta = 0$ if the term in brackets is positive. That is, when the bracket $\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] > 0$ then equation a
corresponds to a harmonic oscillator with angular velocity $\Omega$ given by

$$\Omega^2 = \frac{g}{l}\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$

The adjacent figure shows the phase-space diagrams for a plane pendulum rotating about a vertical axis at
angular velocity $\omega$ for (a) $\omega < \sqrt{g/l}$ and (b) $\omega > \sqrt{g/l}$. The upper phase plot shows small $\omega$ when
the square bracket of equation a is positive and the phase-space trajectories are ellipses around the stable
equilibrium point $(0,0)$. As $\omega$ increases the bracket becomes smaller and changes sign when
$\omega^2\cos\theta = \frac{g}{l}$. For larger $\omega$ the bracket is negative leading to hyperbolic phase-space trajectories around the
$(\theta, p_\theta) = (0,0)$ equilibrium point, that is, an unstable equilibrium point. However, new stable equilibrium
points now occur at angles $(\theta, p_\theta) = (\pm\theta_0, 0)$ where $\cos\theta_0 = \frac{g}{l\omega^2}$. That is, the equilibrium point $(0,0)$
undergoes bifurcation as illustrated in the lower figure. These new equilibrium points are stable as illustrated
by the elliptical trajectories around these points. It is interesting that these new equilibrium points $\pm\theta_0$ move
to larger angles given by $\cos\theta_0 = \frac{g}{l\omega^2}$ beyond the bifurcation point at $\frac{g}{l\omega^2} = 1$. For low energy the mass
oscillates about the minimum at $\theta = \theta_0$ whereas the motion becomes more complicated for higher
energy. The bifurcation corresponds to symmetry breaking since, under spatial reflection, the equilibrium
point is unchanged at low rotational frequencies but it transforms from $+\theta_0$ to $-\theta_0$ once the
solution bifurcates, that is, the symmetry is broken. Also chaos can occur at the separatrix that separates
the bifurcation. Note that either the Lagrange multiplier approach, or the generalized force approach, can be
used to determine the applied torque required to ensure a constant $\omega$ for the cranked pendulum.

Figure: Phase-space diagrams for the plane pendulum cranked at angular velocity $\omega$ about a vertical axis.
Figure (a) is for $\omega < \sqrt{g/l}$ while (b) is for $\omega > \sqrt{g/l}$.
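
The bifurcation described above can be illustrated with a few lines of Python. This is a minimal sketch, assuming $g = 9.81$ m/s$^2$ and a pendulum length of 1 m; the function names are hypothetical. It evaluates the square bracket of equation a at $\theta = 0$ and returns the stable equilibrium angles from $\cos\theta_0 = g/(l\omega^2)$.

```python
import math

def bracket_at_bottom(omega, g=9.81, l=1.0):
    """Square bracket of equation (a) evaluated at theta = 0."""
    return 1.0 - (l / g) * omega ** 2

def stable_equilibria(omega, g=9.81, l=1.0):
    """Stable equilibrium angles theta_0 in radians for crank frequency omega."""
    if omega == 0.0:
        return [0.0]
    ratio = g / (l * omega ** 2)
    if ratio >= 1.0:
        return [0.0]              # below the bifurcation only theta = 0 is stable
    theta0 = math.acos(ratio)     # cos(theta_0) = g / (l omega^2)
    return [-theta0, theta0]

omega_c = math.sqrt(9.81 / 1.0)   # bifurcation frequency sqrt(g/l) for the assumed g and l
for w in (0.5 * omega_c, 1.5 * omega_c):
    print(w, bracket_at_bottom(w), stable_equilibria(w))
```

Below $\omega = \sqrt{g/l}$ the bracket is positive and the only stable point is $\theta = 0$; above it the bracket is negative and the pair $\pm\theta_0$ appears.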


12.5 Example: Nucleon orbits in deformed nuclei 


Consider the rotation of an axially-symmetric, prolate-deformed nucleus. Many nuclei have a prolate
spheroidal shape (the shape of a rugby ball) and they rotate perpendicular to the symmetry axis.
In the non-inertial body-fixed frame, pairs of nucleons, each with angular momentum $j$, are bound in
orbits with the projection of the angular momentum along the symmetry axis being conserved with value
$\Omega = K$, which is a cyclic variable. Since the nucleus is of dimensions $10^{-14}$ m, quantization is important
and the quantized binding energies of the individual nucleons are separated by spacings < 500 keV.

Figure: Schematic diagram for the strong coupling of a nucleon to the deformation axis. The projection of $I$
on the symmetry axis is $K$, and the projection of $j$ is $\Omega$. For axial symmetry Noether's theorem gives that
the projection of the angular momentum $K$ on the symmetry axis is a conserved quantity.

The Lagrangian and Hamiltonian are scalars and can be evaluated in any coordinate frame of
reference. It is most useful to calculate the Hamiltonian for a deformed body in the non-inertial
rotating body-fixed frame of reference. The body-fixed Hamiltonian corresponds to the Routhian $R_{noncyclic}$

$$R_{noncyclic} = H - \boldsymbol{\omega}\cdot\mathbf{J}$$

where it is assumed that the deformed nucleus has the symmetry axis along the $z$ direction and rotates about
the $x$ axis. Since the Routhian is for a non-inertial rotating frame of reference it does not include the total
energy but, if the shape is constant in time, then $R_{noncyclic}$ and the corresponding body-fixed Hamiltonian
are conserved and the energy levels for the nucleons bound in the spheroidal potential well can be calculated
using a conventional quantum mechanical model.

For a prolate spheroidal deformed potential well, the nucleon orbits that have the angular momentum
nearly aligned to the symmetry axis correspond to nucleon trajectories that are restricted to the narrowest
part of the spheroid, whereas trajectories with the angular momentum vector close to perpendicular to the
symmetry axis have trajectories that probe the largest radii of the spheroid. The Heisenberg Uncertainty
Principle, mentioned in chapter 3.11.3, describes how orbits restricted to the smallest dimension will have
the highest linear momentum, and corresponding kinetic energy, and vice versa for the larger sized orbits.
Thus the binding energy of different nucleon trajectories in the spheroidal potential well depends on the angle
between the angular momentum vector and the symmetry axis of the spheroid as well as the deformation of
the spheroid. A quantal nuclear model Hamiltonian is solved for assumed spheroidal-shaped potential wells.
The corresponding orbits each have angular momenta $j_i$ for which the projection of the angular momentum
along the symmetry axis $\Omega_i$ is conserved, but the projection of $j_i$ in the laboratory frame $j_{i,z}$ is not conserved
since the potential well is not spherically symmetric. However, the total Hamiltonian is spherically symmetric
in the laboratory frame, which is satisfied by allowing the deformed spheroidal potential well to rotate freely in
the laboratory frame, and then $j_i^2$, $j_{i,z}$, and $\Omega_i$ all are conserved quantities. The attractive residual nucleon-nucleon
pairing interaction results in pairs of nucleons being bound in time-reversed orbits $(j \times j)^0$, that
is, with resultant total spin zero, in this spheroidal nuclear potential. Excitation of an even-even nucleus
can break one pair and then the total projection of the angular momentum along the symmetry axis is
$K = |\Omega_1 \pm \Omega_2|$, depending on whether the projections are parallel or antiparallel. More excitation energy
can break several pairs and the projections continue to be additive. The binding energies calculated in the
spheroidal potential well must be added to the rotational energy $E_{rot} = \frac{\mathcal{J}}{2}\omega^2$ to get the total energy, where
$\mathcal{J}$ is the moment of inertia. Nuclear structure measurements are in good agreement with the predictions of
nuclear structure calculations that employ the Routhian approach.


12.10 Effective gravitational force near the surface of the Earth 


Consider that the translational acceleration of the center of 
the Earth can be neglected, and thus a set of non-rotating 
axes through the center of the Earth can be assumed to be 
approximately an inertial frame. The effects of the motion of 
the Earth around the Sun, or the motion of the Solar system 
in our Galaxy, are small compared with the effects due to the 
rotation of the Earth. 

Consider a rotating frame attached to the surface of the earth as shown in figure 12.5. The vector with
respect to the center of the Earth $\mathbf{r}$ can be decomposed into a vector to the origin of the reference frame
fixed to the surface of the Earth $\mathbf{R}$, plus the vector with respect to this surface reference frame $\mathbf{r}'$.

$$\mathbf{r} = \mathbf{R} + \mathbf{r}' \tag{12.48}$$


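If the external force is separated into the gravitational term $m\mathbf{g}$, plus some other physical force $\mathbf{F}$,
then the acceleration in the non-inertial surface frame of reference is

$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(\mathbf{A} + 2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right) \tag{12.49}$$

Figure 12.5: Rotating frame at the surface of the Earth.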


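But

$$\mathbf{V} = \left(\frac{d\mathbf{R}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{R}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{R} = \boldsymbol{\omega}\times\mathbf{R} \tag{12.50}$$

since in the rotating frame $\left(\frac{d\mathbf{R}}{dt}\right)_{rotating} = 0$. Also the acceleration

$$\mathbf{A} = \left(\frac{d\mathbf{V}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{V}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{V} = \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{R}) \tag{12.51}$$

since $\left(\frac{d\mathbf{V}}{dt}\right)_{rotating} = 0$. Substituting this into the above equation gives

$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times[\mathbf{r}' + \mathbf{R}]) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)$$
$$= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)$$

where $\mathbf{r}$ is with respect to the center of the Earth. This is as expected directly from equation 12.36. Since
the angular frequency of the earth is a constant then $\dot{\boldsymbol{\omega}}\times\mathbf{r}' = 0$. Thus the acceleration can be written as

$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \left[\mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})\right] - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{12.52}$$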


The term in the square brackets combines the gravitational acceleration plus the centrifugal acceleration.

A measurement of the Earth's gravitational acceleration actually measures the term in the square brackets
in equation 12.52, that is, an effective gravitational acceleration where

$$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) \tag{12.53}$$

near the surface of the earth $r \approx R$. The effective gravitational force does not point towards the center of the
Earth as shown in figure 12.6. A plumb line points, or an object falls, in the direction of $\mathbf{g}_{eff}$. The shape
of the earth is such that the Earth's surface is perpendicular to $\mathbf{g}_{eff}$. This is the reason why the earth is
distorted into an oblate ellipsoid, that is, it is flattened at the poles.

The angle $\alpha$ between $\mathbf{g}_{eff}$ and the line pointing to the center of the earth is dependent on the latitude
$\lambda = \frac{\pi}{2} - \theta$. Note that the colatitude $\theta$ is taken to be zero at the North pole whereas the latitude $\lambda$ is taken to
be zero at the equator. The angle $\alpha$ can be estimated by assuming that $r' \ll R$; the centrifugal term
then can be approximated by

$$|\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})| = \omega^2 R\sin\theta = \omega^2 R\cos\lambda \tag{12.54}$$

Figure 12.6: Effective gravitational acceleration.

This is quite small for the Earth since $\omega = 0.73\times 10^{-4}$ rad/s and $R = 6371$ km, leading to a correction
term $\omega^2 R\cos\lambda = 0.03\cos\lambda$ m/s$^2$. Since

$$g_{eff}^{horizontal} = \omega^2 R\cos\lambda\sin\lambda \tag{12.55}$$
and
$$g_{eff}^{vertical} = g - \omega^2 R\cos^2\lambda \tag{12.56}$$


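Then the angle $\alpha$ between $\mathbf{g}_{eff}$ and $\mathbf{g}$ is given by

$$\alpha \approx \tan\alpha = \frac{g_{eff}^{horizontal}}{g_{eff}^{vertical}} = \frac{\omega^2 R\cos\lambda\sin\lambda}{g - \omega^2 R\cos^2\lambda} \tag{12.57}$$

This has a maximum value near $\lambda = 45°$. With the values of $\omega$ and $R$ quoted above and $g \approx 9.8$ m/s$^2$, equation 12.57
evaluates there to $\alpha \approx 0.0017$ rad, that is, about $0.1°$.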
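
Equations 12.55 to 12.57 are easy to tabulate as a function of latitude. The following minimal Python sketch does so using the values of $\omega$ and $R$ quoted in the text; the value $g = 9.81$ m/s$^2$ and the function name are assumptions made for illustration.

```python
import math

OMEGA = 0.73e-4   # rad/s, Earth's rotation rate as quoted in the text
R_EARTH = 6.371e6  # m, Earth's radius as quoted in the text
G = 9.81           # m/s^2, assumed value of g

def deflection_angle(lat_deg, omega=OMEGA, radius=R_EARTH, g=G):
    """Angle alpha (radians) between g_eff and g at latitude lat_deg."""
    lam = math.radians(lat_deg)
    horizontal = omega ** 2 * radius * math.cos(lam) * math.sin(lam)  # equation 12.55
    vertical = g - omega ** 2 * radius * math.cos(lam) ** 2           # equation 12.56
    return math.atan2(horizontal, vertical)                           # equation 12.57

for lat in (0, 15, 30, 45, 60, 75, 90):
    print(lat, deflection_angle(lat))
```

The angle vanishes at the equator and at the poles and is largest at mid latitudes, as the $\cos\lambda\sin\lambda$ factor in equation 12.55 suggests.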
12.11 Free motion on the earth


The calculation of trajectories for objects as they move near the surface of the earth is frequently
required for many applications. Such calculations require inclusion of the non-inertial Coriolis force.
In the frame of reference fixed to the earth's surface, assuming that air resistance and other
forces can be neglected, then the acceleration equals

$$\mathbf{a}' = \mathbf{g}_{eff} - 2\boldsymbol{\omega}\times\mathbf{v}' \tag{12.58}$$


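Neglect the centrifugal correction term since it is very small, that is, let $\mathbf{g}_{eff} = \mathbf{g}$. Using the
coordinate axes shown in figure 12.7, the surface-frame vectors have components

$$\boldsymbol{\omega} = 0\,\hat{\mathbf{i}}' + \omega\cos\lambda\,\hat{\mathbf{j}}' + \omega\sin\lambda\,\hat{\mathbf{k}}' \tag{12.59}$$
and
$$\mathbf{g}_{eff} = -g\,\hat{\mathbf{k}}' \tag{12.60}$$

Figure 12.7: Rotating frame fixed on the surface of the Earth.

Thus the Coriolis term is

$$2\boldsymbol{\omega}\times\mathbf{v}' = 2\begin{vmatrix} \hat{\mathbf{i}}' & \hat{\mathbf{j}}' & \hat{\mathbf{k}}' \\ 0 & \omega\cos\lambda & \omega\sin\lambda \\ \dot{x}' & \dot{y}' & \dot{z}' \end{vmatrix} = 2\left[(\dot{z}'\omega\cos\lambda - \dot{y}'\omega\sin\lambda)\,\hat{\mathbf{i}}' + (\dot{x}'\omega\sin\lambda)\,\hat{\mathbf{j}}' - (\dot{x}'\omega\cos\lambda)\,\hat{\mathbf{k}}'\right]$$

Therefore the equations of motion are
$$m\ddot{\mathbf{r}}' = -mg\,\hat{\mathbf{k}}' - 2m\left[\hat{\mathbf{i}}'(\dot{z}'\omega\cos\lambda - \dot{y}'\omega\sin\lambda) + \hat{\mathbf{j}}'\dot{x}'\omega\sin\lambda - \hat{\mathbf{k}}'\dot{x}'\omega\cos\lambda\right] \tag{12.61}$$
That is, the components of this equation of motion are

$$\begin{aligned} \ddot{x}' &= -2\omega(\dot{z}'\cos\lambda - \dot{y}'\sin\lambda) \\ \ddot{y}' &= -2\omega\dot{x}'\sin\lambda \\ \ddot{z}' &= -g + 2\omega\dot{x}'\cos\lambda \end{aligned} \tag{12.62}$$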


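Integrating these differential equations gives

$$\begin{aligned} \dot{x}' &= -2\omega(z'\cos\lambda - y'\sin\lambda) + \dot{x}'_0 \\ \dot{y}' &= -2\omega x'\sin\lambda + \dot{y}'_0 \\ \dot{z}' &= -gt + 2\omega x'\cos\lambda + \dot{z}'_0 \end{aligned} \tag{12.63}$$

where $\dot{x}'_0, \dot{y}'_0, \dot{z}'_0$ are the initial velocities. Substituting the above velocity relations into the equation of motion
for $\ddot{x}'$ gives
$$\ddot{x}' = 2\omega gt\cos\lambda - 2\omega(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda) - 4\omega^2 x' \tag{12.64}$$

The last term $4\omega^2 x'$ is small and can be neglected leading to a simple uncoupled second-order differential
equation in $x'$. Integrating this twice assuming that $x'_0 = y'_0 = z'_0 = 0$, plus the fact that $2\omega g\cos\lambda$ and
$2\omega(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda)$ are constant, gives

$$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda) + \dot{x}'_0 t \tag{12.65}$$

Similarly,
$$y' = \dot{y}'_0 t - \omega\dot{x}'_0 t^2\sin\lambda \tag{12.66}$$

$$z' = -\frac{1}{2}gt^2 + \dot{z}'_0 t + \omega\dot{x}'_0 t^2\cos\lambda \tag{12.67}$$

Consider the following special cases: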


12.6 Example: Free fall from rest
Assume that an object falls a height $h$ starting from rest at $t = 0$, $x' = 0$, $y' = 0$, $z' = h$. Then

$$\begin{aligned} x' &= \frac{1}{3}\omega gt^3\cos\lambda \\ y' &= 0 \\ z' &= h - \frac{1}{2}gt^2 \end{aligned}$$

Substituting for the fall time $t = \sqrt{2h/g}$, at which $z' = 0$, gives

$$x' = \frac{1}{3}\omega\cos\lambda\sqrt{\frac{8h^3}{g}}$$

Thus the object drifts eastward as a consequence of the earth's rotation. Note that relative to the fixed frame
it is obvious that the angular velocity of the body must increase as it falls to compensate for the reduced
distance from the axis of rotation in order to ensure that the angular momentum is conserved.
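
The free-fall drift can be checked numerically. The sketch below is an illustration rather than part of the original derivation: it integrates the component equations of motion 12.62 with a hand-written fourth-order Runge-Kutta step and compares the eastward drift with the closed-form result above. The values $g = 9.81$ m/s$^2$ and $\omega = 0.73\times 10^{-4}$ rad/s, the helper names, and the step count are assumptions.

```python
import math

OMEGA = 0.73e-4  # rad/s, Earth's rotation rate quoted earlier in the chapter
G = 9.81         # m/s^2, assumed value of g

def deriv(state, lam):
    """Time derivative of (x, y, z, vx, vy, vz) from the component equations 12.62."""
    x, y, z, vx, vy, vz = state
    ax = -2 * OMEGA * (vz * math.cos(lam) - vy * math.sin(lam))
    ay = -2 * OMEGA * vx * math.sin(lam)
    az = -G + 2 * OMEGA * vx * math.cos(lam)
    return (vx, vy, vz, ax, ay, az)

def rk4_step(state, dt, lam):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, f):
        return tuple(s + f * ki for s, ki in zip(state, k))
    k1 = deriv(state, lam)
    k2 = deriv(shifted(k1, dt / 2), lam)
    k3 = deriv(shifted(k2, dt / 2), lam)
    k4 = deriv(shifted(k3, dt), lam)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def drift_numeric(h, lat_deg, steps=2000):
    """Eastward displacement after the fall time sqrt(2h/g), by direct integration."""
    lam = math.radians(lat_deg)
    dt = math.sqrt(2 * h / G) / steps
    state = (0.0, 0.0, h, 0.0, 0.0, 0.0)   # released from rest at height h
    for _ in range(steps):
        state = rk4_step(state, dt, lam)
    return state[0]

def drift_formula(h, lat_deg):
    """Closed-form drift (1/3) omega cos(lambda) sqrt(8 h^3 / g)."""
    lam = math.radians(lat_deg)
    return OMEGA * math.cos(lam) * math.sqrt(8 * h ** 3 / G) / 3

print(drift_numeric(100.0, 45.0), drift_formula(100.0, 45.0))
```

The two values differ only by terms of order $\omega^2$, which is the size of the term neglected in equation 12.64.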

12.7 Example: Projectile fired vertically upwards

An upward fired projectile with initial velocities $\dot{x}'_0 = \dot{y}'_0 = 0$ and $\dot{z}'_0 = v_0$ leads to the relations

$$\begin{aligned} x' &= \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2 v_0\cos\lambda \\ y' &= 0 \\ z' &= -\frac{1}{2}gt^2 + v_0 t \end{aligned}$$

Solving for $t$ when $z' = 0$ gives $t = 0$, and $t = \frac{2v_0}{g}$. Also since the maximum height $h$ that the projectile
reaches is related by
$$v_0 = \sqrt{2gh}$$
then the final deflection is
$$x' = -\frac{4}{3}\omega\cos\lambda\sqrt{\frac{8h^3}{g}}$$
Thus the body drifts westwards.


12.8 Example: Motion parallel to Earth’s surface



For motion in the horizontal $x' - y'$ plane the deflection is always to the right in the northern hemisphere
of the Earth since the vertical component of $\boldsymbol{\omega}$ is upwards and thus $-2\boldsymbol{\omega}\times\mathbf{v}'$ points to the right. In the
southern hemisphere the vertical component of $\boldsymbol{\omega}$ is downward and thus $-2\boldsymbol{\omega}\times\mathbf{v}'$ points to the left. This
is also shown using the above relations for the case of a projectile fired upwards in an easterly direction with
components $\dot{x}'_0, 0, \dot{z}'_0$. The resultant displacements are

$$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2\dot{z}'_0\cos\lambda + \dot{x}'_0 t$$

Similarly,
$$y' = -\omega\dot{x}'_0 t^2\sin\lambda$$
$$z' = -\frac{1}{2}gt^2 + \dot{z}'_0 t + \omega\dot{x}'_0 t^2\cos\lambda$$
The trajectory is non-planar and, in the northern hemisphere, the projectile drifts to the right, that is
southerly.
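
The closed-form displacements 12.65 to 12.67 that underlie these three examples can be collected in one small function. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, $g = 9.81$ m/s$^2$ is assumed, the launch point is the origin, and $x'$ is taken as eastward, as the eastward and westward drifts in the examples indicate.

```python
import math

OMEGA = 0.73e-4  # rad/s, Earth's rotation rate quoted earlier in the chapter
G = 9.81         # m/s^2, assumed value of g

def displacement(t, lat_deg, vx0=0.0, vy0=0.0, vz0=0.0, omega=OMEGA, g=G):
    """Return (x', y', z') at time t from equations 12.65-12.67 (first order in omega)."""
    lam = math.radians(lat_deg)
    c, s = math.cos(lam), math.sin(lam)
    x = omega * g * t ** 3 * c / 3 - omega * t ** 2 * (vz0 * c - vy0 * s) + vx0 * t
    y = vy0 * t - omega * vx0 * t ** 2 * s
    z = -0.5 * g * t ** 2 + vz0 * t + omega * vx0 * t ** 2 * c
    return x, y, z

# Projectile fired vertically upwards at an assumed 50 m/s from latitude 45 degrees:
v0 = 50.0
t_flight = 2 * v0 / G
print(displacement(t_flight, 45.0, vz0=v0))

# Projectile fired eastwards and upwards (assumed speeds) at the same latitude:
print(displacement(t_flight, 45.0, vx0=100.0, vz0=v0))
```

In the first case the sign of $x'$ is negative (westward drift); in the second case $y'$ is negative in the northern hemisphere (southerly drift), in line with the examples above.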

In the battle of the River de la Plata, during World War 2, the gunners on the British light cruisers 
Exeter, Ajax and Achilles found that their accurately aimed salvos against the German pocket battleship Graf 
Spee were falling 100 yards to the left. The designers of the gun sighting mechanisms had corrected for the 
Coriolis effect assuming the ships would fight at latitudes near 50° north, not 50° south. 
12.12 Weather systems


Weather systems on Earth provide a classic example of motion in a rotating coordinate system. In the 
northern hemisphere, air flowing into a low-pressure region is deflected to the right causing counterclockwise 
circulation, whereas air flowing out of a high-pressure region is deflected to the right causing a clockwise 
circulation. Trade winds on the Earth result from air rising or sinking due to thermal activity combined 
with the Coriolis effect. Similar behavior is observed on other planets such as the Red Spot on Jupiter. 

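For a fluid or gas, equation (12.36) can be written in terms of the fluid density $\rho$ in the form

$$\rho\mathbf{a}'' = -\nabla P - \rho\left[2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')\right] \tag{12.68}$$

where the translational acceleration $\mathbf{A}$, the gravitational force, and the azimuthal acceleration $(\dot{\boldsymbol{\omega}}\times\mathbf{r}')$ terms
are ignored. The external force per unit volume equals the pressure gradient $-\nabla P$ while $\boldsymbol{\omega}$ is the rotation
vector of the earth.

In fluid flow, the Rossby number $Ro$ is defined to be

$$Ro = \frac{\text{inertial force}}{\text{Coriolis force}} \approx \frac{a''}{2\boldsymbol{\omega}\times\mathbf{v}''} \tag{12.69}$$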


For large dimensional pressure systems in the atmosphere, e.g. $L \sim 1000$ km, the Rossby number is $Ro \sim 0.1$
and thus the Coriolis force dominates and the radial acceleration can be neglected. This leads to a flow
velocity $v \sim 10$ m/s which is perpendicular to the pressure gradient $\nabla P$, that is, the air flows horizontally
parallel to the isobars of constant pressure which is called geostrophic flow. For much smaller dimension
systems, such as at the wall of a hurricane, $L \sim 50$ km, and $v \approx 50$ m/s, the Rossby number $Ro \sim 10$ and
the Coriolis effect plays a much less significant role compared to the balance between the radial centrifugal
forces and the pressure gradient. The same situation of the Coriolis forces being insignificant occurs for most
small-scale vortices such as tornadoes, typical thermal vortices in the atmosphere, and for water draining a
bath tub.
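
The orders of magnitude quoted above can be reproduced with a rough scaling estimate. The sketch below assumes the common estimate $Ro \approx v/(2\omega\sin\lambda\,L)$, obtained by taking the inertial acceleration as $v^2/L$ and the Coriolis acceleration as $2\omega\sin\lambda\,v$; this scaling, the chosen latitudes, and the function name are assumptions for illustration and are not taken from the text.

```python
import math

OMEGA = 0.73e-4  # rad/s, Earth's rotation rate quoted earlier in the chapter

def rossby_number(speed, length, lat_deg, omega=OMEGA):
    """Scaling estimate Ro ~ v / (2 omega sin(latitude) L)."""
    coriolis_parameter = 2 * omega * math.sin(math.radians(lat_deg))
    return speed / (coriolis_parameter * length)

# Large-scale pressure system: v ~ 10 m/s, L ~ 1000 km, assumed latitude 45 degrees
ro_synoptic = rossby_number(10.0, 1.0e6, 45.0)
# Hurricane eye wall: v ~ 50 m/s, L ~ 50 km, assumed latitude 20 degrees
ro_hurricane = rossby_number(50.0, 5.0e4, 20.0)
print(ro_synoptic, ro_hurricane)
```

With these inputs the first estimate is of order 0.1 and the second of order 10, consistent with the regimes described in the text.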


12.12.1 Low-pressure systems: 


It is interesting to analyze the motion of air circulating around a low pressure region at large radii where
the motion is tangential. As shown in figure 12.8, a parcel of air circulating anticlockwise around the
low with velocity $v$ involves a pressure difference $\Delta P$ acting on the surface area $S$, plus the centrifugal and
Coriolis forces. Assuming that these forces are balanced such that $\mathbf{a}'' \approx 0$, then equation 12.68 simplifies to
$$\frac{v^2}{r} = \frac{1}{\rho}\nabla P - 2v\omega\sin\lambda \tag{12.70}$$
where the latitude $\lambda = \frac{\pi}{2} - \theta$. Thus the force equation can be written

$$\frac{1}{\rho}\frac{dP}{dr} = \frac{v^2}{r} + 2v\omega\sin\lambda \tag{12.71}$$

Figure 12.8: Air flow and pressures around a low-pressure region.

It is apparent that the combined outward Coriolis force plus outward centrifugal force, acting on the
circulating air, can support a large pressure gradient.

The tangential velocity $v$ can be obtained by solving this equation to give

$$v = \sqrt{(r\omega\sin\lambda)^2 + \frac{r}{\rho}\frac{dP}{dr}} - r\omega\sin\lambda \tag{12.72}$$
Note that the velocity equals zero when $r = 0$ assuming that $\frac{dP}{dr}$ is finite. That is, the velocity reaches a
maximum at a radius
1 1 dP 


Treakvel = ne (12.73) 


pwsin A dr 
which occurs at the wall of the eye of the circulating low-pressure system.

Figure 12.9: Hurricane Katrina over the Gulf of Mexico on 28 August 2005. [Published by the NOAA]
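
Equation 12.72 for the tangential wind speed around a low is simple to evaluate. The following minimal Python sketch does so; the air density of 1.2 kg/m$^3$, the sample radius, the sample pressure gradient, and the function name are assumptions chosen only for illustration.

```python
import math

OMEGA = 0.73e-4  # rad/s, Earth's rotation rate quoted earlier in the chapter

def low_pressure_wind(r, dP_dr, lat_deg, rho=1.2, omega=OMEGA):
    """Tangential speed v (m/s) from equation 12.72.

    r      radius from the center of the low (m)
    dP_dr  radial pressure gradient (Pa/m), taken positive
    rho    air density (kg/m^3), assumed value
    """
    a = r * omega * math.sin(math.radians(lat_deg))
    return math.sqrt(a * a + r * dP_dr / rho) - a

# Assumed sample: 2 mb per 100 km (2e-3 Pa/m) at r = 200 km and latitude 40 degrees
v = low_pressure_wind(2.0e5, 2.0e-3, 40.0)
print(v)
```

Substituting the returned speed back into equation 12.71 recovers the assumed pressure gradient, which is a convenient consistency check.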


Low pressure regions are produced by heating of air causing it to rise and resulting in an inflow of air
to replace the rising air. Hurricanes form over warm water when the temperature exceeds 26°C and the
moisture levels are above average. They are created at latitudes between 10° and 15° where the sea is warmest,
but not closer to the equator where the Coriolis force drops to zero. About 90% of the heating of the air comes
from the latent heat of vaporization due to the rising warm moist air condensing into water droplets in the
cloud, similar to what occurs in thunderstorms. For hurricanes in the northern hemisphere, the air circulates
anticlockwise inwards. Near the wall of the eye of the hurricane, the air rises rapidly to high altitudes at
which it then flows clockwise and outwards and subsequently back down in the outer reaches of the hurricane.
Both the wind velocity and pressure are low inside the eye, which can be cloud free. The strongest winds
are in the vortex surrounding the eye of the hurricane, while weak winds exist in the counter-rotating vortex of
sinking air that occurs far outside the hurricane.


Figure 12.9 shows the satellite picture of the hurricane Katrina, recorded on 28 August 2005. The eye of
the hurricane is readily apparent in this picture. The central pressure was 90200 N/m$^2$ (902 mb) compared
with the standard atmospheric pressure of 101300 N/m$^2$ (1013 mb). This 111 mb pressure difference produced
steady winds in Katrina of 280 km/hr (175 mph) with gusts up to 344 km/hr which resulted in 1833 fatalities.


Tornadoes are another example of a vortex low-pressure system; they are the opposite extreme in both
size and duration compared with a hurricane. Tornadoes may last only about 10 minutes and be quite small in
radius. Pressure drops of up to 100 mb have been recorded, but since they may only be a few hundred meters in
diameter, the pressure gradient can be much higher than for hurricanes leading to localized winds thought to
approach 500 km/hr. Unfortunately, the instrumentation and buildings hit by a tornado often are destroyed,
making study difficult. Note that the pressure gradient in small-diameter rope tornadoes is much larger
than in larger 1/4 mile diameter tornadoes, which results in stronger and more destructive winds.
12.12.2 High-pressure systems:


In contrast to low-pressure systems, high-pressure systems are very different in that the Coriolis force points
inward opposing the outward pressure gradient and centrifugal force. That is,

$$\frac{v^2}{r} = 2v\omega\sin\lambda - \frac{1}{\rho}\frac{dP}{dr} \tag{12.74}$$
which gives that
$$v = r\omega\sin\lambda - \sqrt{(r\omega\sin\lambda)^2 - \frac{r}{\rho}\frac{dP}{dr}} \tag{12.75}$$
This implies that the maximum pressure gradient plus centrifugal force supported by the Coriolis force is
$$\frac{r}{\rho}\frac{dP}{dr} \leq (r\omega\sin\lambda)^2 \tag{12.76}$$

As a consequence, high pressure regions tend to have weak pressure gradients and light winds in contrast
to the large pressure gradients plus concomitant damaging winds possible for low pressure systems such as
hurricanes or tornadoes.
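
The limit on the pressure gradient around a high can be made concrete with a short sketch of equations 12.75 and 12.76. The air density of 1.2 kg/m$^3$, the sample radius and latitude, and the function names are assumptions for illustration; `dP_dr` denotes the magnitude of the radial pressure gradient.

```python
import math

OMEGA = 0.73e-4  # rad/s, Earth's rotation rate quoted earlier in the chapter

def max_pressure_gradient(r, lat_deg, rho=1.2, omega=OMEGA):
    """Largest dP/dr (Pa/m) allowed by equation 12.76 at radius r."""
    return rho * r * (omega * math.sin(math.radians(lat_deg))) ** 2

def high_pressure_wind(r, dP_dr, lat_deg, rho=1.2, omega=OMEGA):
    """Tangential speed from equation 12.75, or None if 12.76 is violated."""
    a = r * omega * math.sin(math.radians(lat_deg))
    discriminant = a * a - r * dP_dr / rho
    if discriminant < 0:
        return None   # no balanced circulation exists for this gradient
    return a - math.sqrt(discriminant)

# Assumed sample: radius 1000 km at latitude 40 degrees
limit = max_pressure_gradient(1.0e6, 40.0)
print(limit, high_pressure_wind(1.0e6, 0.5 * limit, 40.0))
print(high_pressure_wind(1.0e6, 1.01 * limit, 40.0))
```

A gradient above the limit returns `None`, which mirrors the statement that high pressure regions cannot sustain steep pressure gradients.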

The circulation behavior, exhibited by weather patterns, also applies to ocean currents and other liquid 
flow on earth. However, the residual angular momentum of the liquid often can overcome the Coriolis terms. 
Thus often it will be found experimentally that water exiting the bathtub does not circulate anticlockwise in 
the northern hemisphere as predicted by the Coriolis force. This is because it was not stationary originally, 
but rotating slowly. 

Reliable prediction of weather is an extremely difficult, complicated and challenging task, which is of con- 
siderable importance in modern life. As discussed in chapter 16.8, fluid flow can be much more complicated 
than assumed in this discussion of air flow and weather. Both turbulent and laminar flow are possible. As a 
consequence, computer simulations of weather phenomena are difficult because the air flow can be turbulent 
and the transition from order to chaotic flow is very sensitive to the initial conditions. Typically the air 
flow can involve both macroscopic ordered coherent structures over a wide dynamic range of dimensions, 
coexisting with chaotic regions. Computer simulations of fluid flow often are performed based on Lagrangian 
mechanics to exploit the scalar properties of the Lagrangian. Ordered coherent structures, ranging from 
microscopic bubbles to hurricanes, can be recognized by exploiting Lyapunov exponents to identify the or- 
dered motion buried in the underlying chaos. Thus the techniques discussed in classical mechanics are of 
considerable importance outside of physics. 


12.13 Foucault pendulum 


A classic example of motion in non-inertial frames is the rotation of the Foucault pendulum on the surface
of the earth. The Foucault pendulum is a spherical pendulum with a long suspension that oscillates in the
$x - y$ plane with sufficiently small amplitude that the vertical velocity $\dot{z}$ is negligible. Assume that the
pendulum is a simple pendulum of length $l$ and mass $m$ as shown in figure 12.10. The equation of motion is
given by

$$\ddot{\mathbf{r}} = \mathbf{g} + \frac{\mathbf{T}}{m} - 2\boldsymbol{\Omega}\times\dot{\mathbf{r}} \tag{12.77}$$

where $\frac{\mathbf{T}}{m}$ is the acceleration produced by the tension in the pendulum suspension and the rotation vector of
the earth is designated by $\boldsymbol{\Omega}$ to avoid confusion with the oscillation frequency of the pendulum
$\omega$. The effective gravitational acceleration $\mathbf{g}$ is given by

$$\mathbf{g} = \mathbf{g}_0 - \boldsymbol{\Omega}\times[\boldsymbol{\Omega}\times(\mathbf{r} + \mathbf{R})] \tag{12.78}$$

that is, the true gravitational field $\mathbf{g}_0$ corrected for the centrifugal force.

Figure 12.10: Foucault pendulum.
Assume the small angle approximation for the pendulum deflection angle $\beta$, then $T_z = T\cos\beta \approx T$ and
$T_z = mg$, thus $T \approx mg$. Then, as shown in figure 12.10, the horizontal components of the restoring force
are

$$T_x = -mg\frac{x}{l} \tag{12.79}$$

$$T_y = -mg\frac{y}{l} \tag{12.80}$$


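Since $\mathbf{g}$ is vertical, and neglecting terms involving $\dot{z}$, then evaluating the cross product in equation (12.77)
simplifies to

$$\ddot{x} = -g\frac{x}{l} + 2\dot{y}\,\Omega\cos\theta \tag{12.81}$$
$$\ddot{y} = -g\frac{y}{l} - 2\dot{x}\,\Omega\cos\theta \tag{12.82}$$

where $\theta$ is the colatitude which is related to the latitude $\lambda$ by
$$\cos\theta = \sin\lambda \tag{12.83}$$
The natural angular frequency of the simple pendulum is

$$\omega_0 = \sqrt{\frac{g}{l}} \tag{12.84}$$

while the $z$ component of the earth's angular velocity is
$$\Omega_z = \Omega\cos\theta \tag{12.85}$$
Thus equations 12.81 and 12.82 can be written as

$$\begin{aligned} \ddot{x} - 2\Omega_z\dot{y} + \omega_0^2 x &= 0 \\ \ddot{y} + 2\Omega_z\dot{x} + \omega_0^2 y &= 0 \end{aligned} \tag{12.86}$$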

These are two coupled equations that can be solved by making a coordinate transformation.
Define a new coordinate that is a complex number

$$\eta = x + iy \tag{12.87}$$
Multiplying the second of the coupled equations 12.86 by $i$ and adding it to the first equation gives
$$(\ddot{x} + i\ddot{y}) + 2i\Omega_z(\dot{x} + i\dot{y}) + \omega_0^2(x + iy) = 0$$
which can be written as a differential equation for $\eta$
$$\ddot{\eta} + 2i\Omega_z\dot{\eta} + \omega_0^2\eta = 0 \tag{12.88}$$

Note that the complex number $\eta$ contains the same information regarding the position in the $x - y$ plane
as equations 12.86. The plot of $\eta$ in the complex plane, the Argand diagram, is a bird's-eye view of the
position coordinates $(x, y)$ of the pendulum. This second-order homogeneous differential equation has two
independent solutions that can be derived by guessing a solution of the form

$$\eta(t) = Ae^{-i\alpha t} \tag{12.89}$$
Substituting equation 12.89 into 12.88 gives that

$$\alpha^2 - 2\Omega_z\alpha - \omega_0^2 = 0$$

That is

$$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2} \tag{12.90}$$
If the angular velocity of the pendulum $\omega_0 \gg \Omega$, then

$$\alpha \approx \Omega_z \pm \omega_0 \tag{12.91}$$
Thus the solution is of the form
$$\eta(t) = e^{-i\Omega_z t}\left(A_+ e^{i\omega_0 t} + A_- e^{-i\omega_0 t}\right) \tag{12.92}$$
This can be written as
$$\eta(t) = Ae^{-i\Omega_z t}\cos(\omega_0 t + \delta) \tag{12.93}$$

where the phase $\delta$ and amplitude $A$ depend on the initial conditions. Thus the plane of oscillation of the
pendulum is defined by the ratio of the $x$ and $y$ coordinates, that is the phase angle $\Omega_z t$. This phase angle
rotates with angular velocity $\Omega_z$ where

$$\Omega_z = \Omega\cos\theta = \Omega\sin\lambda \tag{12.94}$$


At the north pole the earth rotates under the pendulum with angular velocity $\Omega$ and the axis of the
pendulum is fixed in an inertial frame of reference. At lower latitudes, the pendulum precesses at the lower
angular frequency $\Omega_z = \Omega\sin\lambda$ that goes to zero at the equator. For example, in Rochester, NY, $\lambda = 43°$ N,
and therefore a Foucault pendulum precesses at $\Omega_z = 0.682\,\Omega$. That is, the pendulum precesses 245.5°/day.
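
The latitude dependence of equation 12.94 can be tabulated with a short Python sketch. It assumes, as the Rochester example does, that the Earth turns through 360° per day; the function name is hypothetical.

```python
import math

def foucault_precession(lat_deg):
    """Return (Omega_z / Omega, precession in degrees per day) at latitude lat_deg."""
    fraction = math.sin(math.radians(lat_deg))   # Omega_z = Omega sin(latitude), equation 12.94
    return fraction, 360.0 * fraction

for lat in (0.0, 43.0, 90.0):
    print(lat, foucault_precession(lat))
```

For $\lambda = 43°$ this reproduces the fraction 0.682 and the precession of about 245.5° per day quoted above.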


12.14 Summary 


This chapter has focussed on describing motion in non-inertial frames of reference. It has been shown that the 
force and acceleration in non-inertial frames can be related using either Newtonian or Lagrangian mechanics 
by introducing additional inertial forces in the non-inertial reference frame. 


Translational acceleration of a reference frame: In a primed frame that is undergoing translational
acceleration $\mathbf{A}$, the motion in this non-inertial frame can be calculated by addition of an inertial force $-m\mathbf{A}$,
that leads to an equation of motion

$$m\mathbf{a}' = \mathbf{F} - m\mathbf{A} \tag{12.6}$$

Note that the primed frame is an inertial frame if $\mathbf{A} = 0$.


Rotating reference frame: It was shown that the time derivatives of a general vector $\mathbf{G}$ in both an
inertial frame and a rotating reference frame are related by

$$\left(\frac{d\mathbf{G}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{G}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{G} \tag{12.16}$$

where the $\boldsymbol{\omega}\times\mathbf{G}$ term originates from the fact that the unit vectors in the rotating reference frame are time
dependent with respect to the inertial frame.


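Reference frame undergoing both rotation and translation: Both Newtonian and Lagrangian mechanics
were used to show that for the case of translational acceleration plus rotation, the effective force in
the non-inertial (double-primed) frame can be written as

$$\mathbf{F}_{eff} = m\mathbf{a}'' = \mathbf{F} - m\left(\mathbf{A} + \boldsymbol{\omega}\times\mathbf{V} + 2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right) \tag{12.28, 12.36}$$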


These inertial correction forces result from describing the system using a non-inertial frame. These inertial 
forces are felt when in the rotating-translating frame of reference. Thus the notion of these inertial forces 
can be very useful for solving problems in non-inertial frames. For the case of rotating frames, two important 
inertial forces are the centrifugal force, —w x (w x r’), and the Coriolis force —2w x v”. 


Routhian reduction for rotating systems: It was shown that for non-inertial systems, identical equations
of motion are derived using Newtonian, Lagrangian, Hamiltonian, and Routhian mechanics.

Terrestrial manifestations of rotation: Examples of motion in rotating frames presented in the chapter
included projectile motion with respect to the surface of the Earth, rotation alignment of nucleons in rotating
nuclei, and weather phenomena.


Workshop exercises 


1. 


Consider a fixed reference frame S and a rotating frame 5”. The origins of the two coordinate systems always 
coincide. By carefully drawing a diagram, derive an expression relating the coordinates of a point P in the two 
systems. (This was covered in Chapter 2, but it is worth reviewing now. 


. The effective force observed in a rotating coordinate system is given by equation 12.28. 


(a) What is the significance of each term in this expression? 
(b) Suppose you wanted to measure the gravitational force, both magnitude and direction, on a body of mass 


m at rest on the surface of the Earth. What terms in the effective force can be neglected? 


(c) Suppose you wanted to calculate the deflection of a projectile fired horizontally along the Earth’s surface. 
What terms in the effective force can be neglected? 


(d) Suppose you wanted to calculate the effective force on a small block of mass m placed on a frictionless turntable rotating with a time-dependent angular velocity $\omega(t)$. What terms in the effective force can be neglected? 


3. A plumb line is carried along in a moving train, with m the mass of the plumb bob. Neglect any effects due to the rotation of the Earth and work in the noninertial frame of reference of the train. 


(a) Find the tension in the cord and the deflection from the local vertical if the train is moving with constant acceleration $a_0$. 


(b) Find the tension in the cord and the deflection from the local vertical if the train is rounding a curve of radius $\rho$ with constant speed $v_0$. 


4. A bead on a rotating rod is free to slide without friction. The rod has a length L and rotates about its end with angular velocity $\omega$. The bead is initially released from rest (relative to the rod) at the midpoint of the rod. 


(a) Find the displacement of the bead along the rod as a function of time. 
(b) Find the time when the bead leaves the end of the rod. 
(c) Find the velocity (relative to the rod) of the bead when it leaves the end of the rod. 


5. Here is a “thought experiment” for you to consider. Suppose you are in a small sailboat of mass M at the 
Earth’s equator. At the equator there is very little wind (this is known as the “equatorial doldrums”), so your 
sailboat is, more or less, sitting still. You have a small anchor of mass m on deck and a single mast of height 
h in the middle of the boat. How can you use the anchor to put the boat into motion? In which direction will 
the boat move? 


6. Does water really flow in the other direction when you flush a toilet in the southern hemisphere? What (if 
anything) does the Coriolis force have to do with this? 


7. We are presently at a latitude $\lambda$ (with respect to the equator) and Earth is rotating with constant angular velocity $\omega$. Consider the following two scenarios: Scenario A: A particle is thrown upward with initial speed $v_0$. Scenario B: An identical particle is dropped (at rest) from the maximum height of the particle in Scenario A. Circle all the true statements regarding the Coriolis deflection, assuming that the particles have landed for (a) and (b). 


(a) The magnitude is greater in A than in B. 
(b) The directions in A and B are the same. 
(c) The direction in A does not change throughout flight. 


Problems 


1. If a projectile is fired due east from a point on the surface of the Earth at a northern latitude $\lambda$ with a velocity of magnitude V and at an inclination to the horizontal of $\alpha$, show that the lateral deflection when the projectile strikes the Earth is 

$$ d = \frac{4V^3}{g^2}\,\omega \sin\lambda \,\sin^2\alpha \,\cos\alpha \tag{12.95} $$

where $\omega$ is the rotation frequency of the Earth. 


2. Obtain an expression for the angular deviation of a particle projected from the North Pole in a path that lies close to the surface of the Earth. Is the deviation significant for a missile that makes a 4800-km flight in 10 minutes? What is the “miss distance” if the missile is aimed directly at the target? Is the miss distance greater for a 19300-km flight at the same velocity? 


3. An automobile drag racer drives a car with acceleration a and instantaneous velocity v. The tires of radius $r_0$ are not slipping. Determine which point on the tire has the greatest acceleration relative to the ground. What is this acceleration? 


4. Shot towers were popular in the eighteenth and nineteenth centuries for dropping melted lead down tall towers 
to form spheres for bullets. The lead solidified while falling and often landed in water to cool the lead bullets. 
Many such shot towers were built in New York State. Assume a shot tower was constructed at latitude 42° N, 
and that the lead fell a distance of 27m. In what direction and by how far did the lead bullets land from the 
direct vertical? 


Chapter 13 


Rigid-body rotation 


13.1 Introduction 


Rigid-body rotation features prominently in science, engineering, and sports. Prior chapters have focussed 
primarily on motion of point particles. This chapter extends the discussion to motion of finite-sized rigid 
bodies. A rigid body is a collection of particles where the relative separations remain rigidly fixed. In real 
life, there is always some motion between individual atoms, but usually this microscopic motion can be 
neglected when describing macroscopic properties. Note that the concept of perfect rigidity has limitations in the theory of relativity. Perfect rigidity would imply that signals are transmitted instantaneously between the ends of a rigid body, whereas information cannot travel faster than the velocity of light. 

The description of rigid-body rotation is most easily handled by specifying the properties of the body 
in the rotating body-fixed coordinate frame whereas the observables are measured in the stationary iner- 
tial laboratory coordinate frame. In the body-fixed coordinate frame, the primary observable for classical 
mechanics is the inertia tensor of the rigid body which is well defined and independent of the rotational 
motion. By contrast, in the stationary inertial frame the observables depend sensitively on the details of the 
rotational motion. For example, when observed in the stationary fixed frame, rapid rotation of a long thin 
cylindrical pencil about the longitudinal symmetry axis gives a time-averaged shape of the pencil that looks 
like a thin cylinder, whereas the time-averaged shape is a flat disk for rotation about an axis perpendicular 
to the symmetry axis of the pencil. In spite of this, the pencil always has the same unique inertia tensor 
in the body-fixed frame. Thus the best solution for describing rotation of a rigid body is to use a rotation 
matrix that transforms from the stationary fixed frame to the instantaneous body-fixed frame for which the 
moment of inertia tensor can be evaluated. Moreover, the problem can be greatly simplified by transforming 
to a body-fixed coordinate frame that is aligned with any symmetry axes of the body since then the inertia 
tensor can be diagonal; this is called a principal axis system. 

Rigid-body rotation can be broken into the following two classifications. 

1) Rotation about a fixed axis: 

A body can be constrained to rotate about an axis that has a fixed location and orientation relative to 
the body. The hinged door is a typical example. Rotation about a fixed axis is straightforward since the 
axis of rotation, plus the moment of inertia about this axis, are well defined and this case was discussed in 
chapter 2.12.7. 

2) Rotation about a point 

A body can be constrained to rotate about a fixed point of the body but the orientation of this rotation 
axis about this point is unconstrained. One example is rotation of an object flying freely in space which can 
rotate about the center of mass with any orientation. Another example is a child’s spinning top which has 
one point constrained to touch the ground but the orientation of the rotation axis is unconstrained. 

The prior discussion in chapter 2.12.7 showed that rigid-body rotation is more complicated than assumed 
in introductory treatments of rigid-body rotation. It is necessary to expand the concept of moment of inertia 
to the concept of the inertia tensor, plus the fact that the angular momentum may not point along the 
rotation axis. The most general case requires consideration of rotation about a body-fixed point where the 
orientation of the axis of rotation is unconstrained. The concept of the inertia tensor of a rotating body is 
crucial for describing rigid-body motion. It will be shown that working in the body-fixed coordinate frame of a rotating body allows a description of the equations of motion in terms of the inertia tensor for a given point of the body, and that it is possible to rotate the body-fixed coordinate system into a principal axis system where the inertia tensor is diagonal. The angular momentum is parallel to the angular velocity whenever the angular velocity is aligned with a principal axis. The use of a principal axis system greatly simplifies treatment of rigid-body rotation and exploits the powerful and elegant matrix algebra mentioned in appendix A. 

The following discussion of rigid-body rotation is broken into three topics, (1) the inertia tensor of the 
rigid body, (2) the transformation between the rotating body-fixed coordinate system and the laboratory 
frame, i.e., the Euler angles specifying the orientation of the body-fixed coordinate frame with respect to the 
laboratory frame, and (3) Lagrange and Euler’s equations of motion for rigid-bodies. This is followed by a 
discussion of practical applications. 


13.2 Rigid-body coordinates 


Motion of a rigid body is a special case for motion of the N-body system when the relative positions of 
the N bodies are related. It was shown in chapter 2 that the motion of a rigid body can be broken into 
a combination of a linear translation of some point in the body, plus rotation of the body about an axis 
through that point. This is called Chasles’ Theorem. Thus the position of every particle in the rigid body 
is fixed with respect to one point in the body. If the fixed point of the body is chosen to be the center of 
mass, then, as discussed in chapter 2, it is possible to separate the kinetic energy, linear momentum, and 
angular momentum into the center-of-mass motion, plus the motion about the center of mass. Thus the 
behavior of the body can be described completely using only six independent coordinates governed by six 
equations of motion, three for translation and three for rotation. 
Referred to an inertial frame, the translational motion of the center of mass is governed by 


$$ \mathbf{F}^E = \frac{d\mathbf{P}}{dt} \tag{13.1} $$

while the rotational motion about the center of mass is determined by 

$$ \mathbf{N}^E = \frac{d\mathbf{L}}{dt} \tag{13.2} $$


where the external force $\mathbf{F}^E$ and external torque $\mathbf{N}^E$ are identified separately from the internal forces acting between the particles in the rigid body. It will be assumed that the internal forces are central and thus do not contribute to the angular momentum. 

The location of any fixed point in the body, such as the center of mass, can be specified by three generalized 
cartesian coordinates with respect to a fixed frame. The rotation of the body-fixed axis system about this 
fixed point in the body can be described in terms of three independent angles with respect to the fixed frame. 
There are several possible sets of orthogonal angles that can be used to describe the rotation. This book 
uses the Euler angles $\phi, \theta, \psi$ which correspond first to a rotation $\phi$ about the z axis, then a rotation $\theta$ about 
the x axis subsequent to the first rotation, and finally a rotation $\psi$ about the new z axis following the first 
two rotations. The Euler angles will be discussed in detail following introduction of the inertia tensor and 
angular momentum. 
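
The z, x, z sequence of rotations described above can be sketched as a product of three elementary rotation matrices. This is a minimal illustration, assuming passive rotations (the coordinate axes are rotated, not the body) composed in the order $\phi$, $\theta$, $\psi$; the sign convention and the helper names `rot_z`, `rot_x`, and `euler_zxz` are assumptions made for this sketch, and the detailed treatment of the Euler angles later in the chapter takes precedence.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(angle):
    """Passive rotation of the coordinate axes by `angle` about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Passive rotation of the coordinate axes by `angle` about the x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def euler_zxz(phi, theta, psi):
    """Rotation phi about z, then theta about the new x, then psi about the new z."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))


# Assumed sample angles in radians.
R = euler_zxz(0.3, 0.5, 0.4)
```

The resulting matrix is orthogonal, and for $\theta = 0$ the two rotations about z simply add, which is one reason three angles are needed to reach a general orientation.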


13.3 Rigid-body rotation about a body-fixed point 


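With respect to some point O fixed in the body coordinate system, the angular momentum of the body, summed over its constituent particles $\alpha$, is given by 

$$ \mathbf{L} = \sum_{\alpha}^{N} \mathbf{L}_\alpha = \sum_{\alpha}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha \tag{13.3} $$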


There are two especially convenient choices for the fixed point O. If no point in the body is fixed with 
respect to an inertial coordinate system, then it is best to choose O as the center of mass. If one point of 
the body is fixed with respect to a fixed inertial coordinate system, such as a point on the ground where a 
child’s spinning top touches, then it is best to choose this stationary point as the body-fixed point O. 

Consider a rigid body composed of N particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \ldots, N$. As discussed in chapter 12.4, if the body rotates with an instantaneous angular velocity $\boldsymbol{\omega}$ about some fixed point, with respect to the body-fixed coordinate system, and this point has an instantaneous translational velocity $\mathbf{V}$ with respect to the fixed (inertial) coordinate system, see figure 13.1, then the instantaneous velocity $\mathbf{v}_\alpha$ of the $\alpha^{th}$ particle in the fixed frame of reference is given by 


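$$ \mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.4} $$

However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is $\mathbf{v}''_\alpha = 0$, thus 

$$ \mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.5} $$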


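Consider the translational velocity of the body-fixed point O to be zero, i.e. $\mathbf{V} = 0$, and let $\mathbf{R} = 0$, then $\mathbf{r}_\alpha = \mathbf{r}'_\alpha$. These assumptions allow the linear momentum of the particle $\alpha$ to be written as 

$$ \mathbf{p}_\alpha = m_\alpha \mathbf{v}_\alpha = m_\alpha \boldsymbol{\omega} \times \mathbf{r}_\alpha \tag{13.6} $$

Therefore 

$$ \mathbf{L} = \sum_{\alpha}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha = \sum_{\alpha}^{N} m_\alpha \mathbf{r}_\alpha \times (\boldsymbol{\omega} \times \mathbf{r}_\alpha) \tag{13.7} $$

Figure 13.1: Infinitesimal displacement $d\mathbf{r}'_\alpha$ in the primed frame, broken into a part $d\mathbf{r}^R$ due to rotation of the primed frame plus a part $d\mathbf{r}''$ due to displacement with respect to this rotating frame. 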

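Using the vector identity 

$$ \mathbf{A} \times (\mathbf{B} \times \mathbf{A}) = A^2 \mathbf{B} - \mathbf{A}(\mathbf{A} \cdot \mathbf{B}) $$

leads to 

$$ \mathbf{L} = \sum_{\alpha}^{N} m_\alpha \left[ r_\alpha^2 \boldsymbol{\omega} - \mathbf{r}_\alpha (\mathbf{r}_\alpha \cdot \boldsymbol{\omega}) \right] \tag{13.8} $$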
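
The step from (13.7) to (13.8) can be checked numerically. The following sketch evaluates both forms of the angular momentum for a small set of point masses; the masses, positions, and angular velocity are arbitrary integer values assumed for illustration, and the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def angular_momentum_direct(masses, positions, omega):
    """L = sum over particles of m r x (omega x r), as in equation (13.7)."""
    total = [0, 0, 0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        total = [t + m * c for t, c in zip(total, term)]
    return total


def angular_momentum_expanded(masses, positions, omega):
    """L = sum over particles of m [r^2 omega - r (r . omega)], as in (13.8)."""
    total = [0, 0, 0]
    for m, r in zip(masses, positions):
        r2, r_dot_w = dot(r, r), dot(r, omega)
        total = [t + m * (r2 * w - x * r_dot_w)
                 for t, w, x in zip(total, omega, r)]
    return total


# Arbitrary integer test data, so the comparison below is exact.
masses = [1, 2, 3]
positions = [(1, 0, 0), (0, 2, 1), (-1, 1, 3)]
omega = (1, 2, 3)
L_direct = angular_momentum_direct(masses, positions, omega)
L_expanded = angular_momentum_expanded(masses, positions, omega)
```

Both functions return the same vector, as the vector identity requires.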
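The angular momentum can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ relative to the body-fixed frame. The following formulae can be written more compactly if $\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha)$, in the rotating body-fixed frame, is written in the form $\mathbf{r}_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers 1, 2, 3 rather than x, y, z. In this notation, the angular momentum is written in component form as 

$$ L_i = \sum_{\alpha}^{N} m_\alpha \left[ \omega_i \sum_{k} x_{\alpha,k}^2 - x_{\alpha,i} \sum_{j} x_{\alpha,j}\,\omega_j \right] \tag{13.9} $$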


Use the Kronecker delta relation 

$$ \omega_i = \sum_{j}^{3} \omega_j \delta_{ij} \tag{13.10} $$

where 

$$ \delta_{ij} = 1 \quad (i = j), \qquad \delta_{ij} = 0 \quad (i \neq j) $$

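Substituting (13.10) in (13.9) gives 

$$ L_i = \sum_{\alpha}^{N} m_\alpha \left[ \sum_{j} \omega_j \delta_{ij} \sum_{k} x_{\alpha,k}^2 - \sum_{j} \omega_j x_{\alpha,i} x_{\alpha,j} \right] = \sum_{j}^{3} \omega_j \left[ \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k} x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) \right] \tag{13.11} $$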


13.4 Inertia tensor 


The square bracket term in (13.11) is called the moment of inertia tensor $\{\mathbf{I}\}$, which is usually referred to as the inertia tensor 

$$ I_{ij} \equiv \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) \tag{13.12} $$


In most cases it is more useful to express the components of the inertia tensor in an integral form over 
the mass distribution rather than a summation for N discrete bodies. That is, 


$$ I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \sum_{k}^{3} x_k^2 - x_i x_j \right) dV \tag{13.13} $$


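The inertia tensor is easier to understand when written in cartesian coordinates $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ rather than in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$. Then, the diagonal moments of inertia of the inertia tensor are 

$$ \begin{aligned} I_{xx} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - x_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ y_\alpha^2 + z_\alpha^2 \right] \\ I_{yy} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - y_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + z_\alpha^2 \right] \\ I_{zz} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - z_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 \right] \end{aligned} \tag{13.14} $$

while the off-diagonal products of inertia are 

$$ \begin{aligned} I_{yx} = I_{xy} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ x_\alpha y_\alpha \right] \\ I_{zx} = I_{xz} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ x_\alpha z_\alpha \right] \\ I_{zy} = I_{yz} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ y_\alpha z_\alpha \right] \end{aligned} \tag{13.15} $$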


Note that the products of inertia are symmetric in that 

$$ I_{ij} = I_{ji} \tag{13.16} $$


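The above notation for the inertia tensor allows the angular momentum (13.11) to be written as 

$$ L_i = \sum_{j}^{3} I_{ij}\,\omega_j \tag{13.17} $$

Expanded in cartesian coordinates 

$$ \begin{aligned} L_x &= I_{xx}\omega_x + I_{xy}\omega_y + I_{xz}\omega_z \\ L_y &= I_{yx}\omega_x + I_{yy}\omega_y + I_{yz}\omega_z \\ L_z &= I_{zx}\omega_x + I_{zy}\omega_y + I_{zz}\omega_z \end{aligned} \tag{13.18} $$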

Note that every fixed point in a body has a specific inertia tensor. The components of the inertia tensor 
at a specified point depend on the orientation of the coordinate frame whose origin is located at the specified 
fixed point. For example, the inertia tensor for a cube is very different when the fixed point is at the center 
of mass compared with when the fixed point is at a corner of the cube. 
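
The definitions (13.12) and (13.17) translate directly into a few lines of code. The sketch below builds the inertia tensor for a set of point masses and applies it to an angular velocity. The dumbbell geometry, the numerical values, and the function names are assumptions chosen for illustration.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj), following (13.12)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r2 - r[i] * r[j])
    return tensor


def angular_momentum(tensor, omega):
    """L_i = sum_j I_ij omega_j, following (13.17)."""
    return [sum(tensor[i][j] * omega[j] for j in range(3)) for i in range(3)]


# Assumed example: a dumbbell of two unit masses at +(1, 1, 0) and -(1, 1, 0),
# rotating about the y axis with unit angular velocity.
masses = [1.0, 1.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
I = inertia_tensor(masses, positions)
L = angular_momentum(I, (0.0, 1.0, 0.0))
```

Here the rotation axis is not a symmetry axis of the dumbbell, so the products of inertia are non-zero and the angular momentum has a component along x even though the angular velocity points along y.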


13.5 Matrix and tensor formulations of rigid-body rotation 


The prior notation is clumsy and can be streamlined by use of matrix methods. Write the inertia tensor in 
a matrix form as 


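$$ \{\mathbf{I}\} \equiv \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \tag{13.19} $$

The angular velocity and angular momentum both can be written as column vectors, that is 

$$ \boldsymbol{\omega} = \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \qquad \mathbf{L} = \begin{pmatrix} L_1 \\ L_2 \\ L_3 \end{pmatrix} \tag{13.20} $$

As discussed in appendix E.2, equation (13.18) now can be written in tensor notation as an inner product of the form 

$$ \mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{13.21} $$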


Note that the above notation uses boldface for the inertia tensor $\{\mathbf{I}\}$, implying a rank-2 tensor representation, while the angular velocity $\boldsymbol{\omega}$ and the angular momentum $\mathbf{L}$ are written as column vectors. The inertia tensor is a 9-component rank-2 tensor defined as the ratio of the angular momentum vector $\mathbf{L}$ and the angular velocity $\boldsymbol{\omega}$. 

$$ \{\mathbf{I}\} = \frac{\mathbf{L}}{\boldsymbol{\omega}} \tag{13.22} $$

Note that, as described in appendix E, the inner product of a vector $\boldsymbol{\omega}$, which is a rank-1 tensor, and a rank-2 tensor $\{\mathbf{I}\}$ leads to the vector $\mathbf{L}$. This compact notation exploits the fact that the matrix and tensor representations are completely equivalent, and are ideally suited to the description of rigid-body rotation. 


13.6 Principal axis system 


The inertia tensor is a real symmetric matrix because of the symmetry given by equation (13.16). A property 
of real symmetric matrices is that there exists an orientation of the coordinate frame, with its origin at the 
chosen body-fixed point O, such that the inertia tensor is diagonal. The coordinate system for which the 
inertia tensor is diagonal is called the Principal axis system which has three perpendicular principal 
axes. Thus, in the principal axis system, the inertia tensor has the form 


$$ \{\mathbf{I}\} = \begin{pmatrix} I_{11} & 0 & 0 \\ 0 & I_{22} & 0 \\ 0 & 0 & I_{33} \end{pmatrix} \tag{13.23} $$

where $I_{jj}$ are real numbers, which are called the principal moments of inertia of the body, and are usually written as $I_j$. When the angular velocity vector $\boldsymbol{\omega}$ points along any principal axis unit vector $\hat{j}$, then the angular momentum $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$ and the magnitude of the principal moment of inertia about this principal axis is given by the relation 

$$ L_j = I_j\,\omega_j \tag{13.24} $$


The principal axes are fixed relative to the shape of the rigid body and they are invariant to the orientation 
of the body-fixed coordinate system used to evaluate the inertia tensor. The advantage of having the body- 
fixed coordinate frame aligned with the principal axis coordinate frame is that then the inertia tensor is 
diagonal, which greatly simplifies the matrix algebra. Even when the body-fixed coordinate system is not 
aligned with the principal axis frame, if the angular velocity is specified to point along a principal axis then 
the corresponding moment of inertia will be given by (13.24). 

In principle it is possible to locate the principal axes by varying the orientation of the angular velocity 
vector w to find those orientations for which the angular momentum L and angular velocity w are parallel 
which characterizes the principal axes. However, the best approach is to diagonalize the inertia tensor. 


13.7 Diagonalize the inertia tensor 


Finding the three principal axes involves diagonalizing the inertia tensor, which is the classic eigenvalue problem discussed in appendix A. Solution of the eigenvalue problem for rigid-body motion corresponds to a rotation of the coordinate frame to the principal axes resulting in the matrix relation 

$$ \{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \boldsymbol{\omega} \tag{13.25} $$

where $I$ comprises the three-valued eigenvalues, while the corresponding vector $\boldsymbol{\omega}$ is the eigenvector. Appendix A.4 gives the solution of the matrix relation 

$$ \{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \{\mathbb{I}\} \boldsymbol{\omega} \tag{13.26} $$

where $I$ are three-valued eigenvalues for the principal axis moments of inertia, and $\{\mathbb{I}\}$ is the unity tensor, equation A.2.4. 

$$ \{\mathbb{I}\} \equiv \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.27} $$

Rewriting (13.26) gives 

$$ \left( \{\mathbf{I}\} - I\{\mathbb{I}\} \right) \cdot \boldsymbol{\omega} = 0 \tag{13.28} $$


This is a matrix equation of the form $\mathbf{A} \cdot \boldsymbol{\omega} = 0$ where $\mathbf{A}$ is a 3 × 3 matrix and $\boldsymbol{\omega}$ is a vector with values $\omega_x, \omega_y, \omega_z$. The matrix equation $\mathbf{A} \cdot \boldsymbol{\omega} = 0$ really corresponds to three simultaneous equations for the three numbers $\omega_x, \omega_y, \omega_z$. It is a well-known property of equations like (13.28) that they have a non-zero solution if, and only if, the determinant $\det(\mathbf{A})$ is zero, that is 

$$ \det\left( \{\mathbf{I}\} - I\{\mathbb{I}\} \right) = 0 \tag{13.29} $$

This is called the characteristic equation, or secular equation, for the matrix $\{\mathbf{I}\}$. The determinant involved is a cubic equation in the value of $I$ that gives the three principal moments of inertia. Inserting one of the three values of $I$ into equation (13.17) gives the corresponding eigenvector $\boldsymbol{\omega}$. Applying the above eigenvalue problem to rigid-body rotation corresponds to requiring that some arbitrary set of body-fixed axes be the principal axes of inertia. This is obtained by rotating the body-fixed axis system such that 

$$ \begin{aligned} L_1 &= I_{11}\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 = I\omega_1 \\ L_2 &= I_{21}\omega_1 + I_{22}\omega_2 + I_{23}\omega_3 = I\omega_2 \\ L_3 &= I_{31}\omega_1 + I_{32}\omega_2 + I_{33}\omega_3 = I\omega_3 \end{aligned} \tag{13.30} $$


or 


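$$ \begin{aligned} (I_{11} - I)\,\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 &= 0 \\ I_{21}\omega_1 + (I_{22} - I)\,\omega_2 + I_{23}\omega_3 &= 0 \\ I_{31}\omega_1 + I_{32}\omega_2 + (I_{33} - I)\,\omega_3 &= 0 \end{aligned} \tag{13.31} $$

These equations have a non-trivial solution for the ratios $\omega_1 : \omega_2 : \omega_3$ since the determinant vanishes, that is 

$$ \begin{vmatrix} (I_{11} - I) & I_{12} & I_{13} \\ I_{21} & (I_{22} - I) & I_{23} \\ I_{31} & I_{32} & (I_{33} - I) \end{vmatrix} = 0 \tag{13.32} $$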


The expansion of this determinant leads to a cubic equation with three roots for $I$. This is the secular equation for $I$ whose eigenvalues are the principal moments of inertia. 

The directions of the principal axes, that is the eigenvectors, can be found by substituting the corresponding solution for $I$ into the prior equation. Thus for eigensolution $I_1$, the eigenvector is given by solving 

$$ \begin{aligned} (I_{11} - I_1)\,\omega_{11} + I_{12}\omega_{21} + I_{13}\omega_{31} &= 0 \\ I_{21}\omega_{11} + (I_{22} - I_1)\,\omega_{21} + I_{23}\omega_{31} &= 0 \\ I_{31}\omega_{11} + I_{32}\omega_{21} + (I_{33} - I_1)\,\omega_{31} &= 0 \end{aligned} \tag{13.33} $$

These equations are solved for the ratios $\omega_{11} : \omega_{21} : \omega_{31}$ which are the direction numbers of the principal axis system corresponding to solution $I_1$. This principal axis system is defined relative to the original coordinate system. This procedure is repeated to find the orientation of the other two mutually perpendicular principal axes. 
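
The roots of the secular cubic can also be obtained in closed form. The sketch below uses the standard trigonometric solution of the characteristic equation of a real symmetric 3 × 3 matrix; this particular technique is not taken from the text and is offered only as one convenient way to evaluate the three principal moments. The function names are hypothetical, and the sample tensor is the one derived in Example 13.2 for a cube about its corner, in units of $Mb^2$.

```python
import math


def det3(a):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def principal_moments(I):
    """Eigenvalues of a real symmetric 3x3 inertia tensor, in ascending order."""
    p1 = I[0][1] ** 2 + I[0][2] ** 2 + I[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the principal moments are the diagonal elements.
        return sorted(I[i][i] for i in range(3))
    q = (I[0][0] + I[1][1] + I[2][2]) / 3.0
    p2 = sum((I[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    # Shift and scale so that the cubic takes the form cos(3 phi) = det(B) / 2.
    B = [[(I[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    r = max(-1.0, min(1.0, det3(B) / 2.0))  # guard against rounding error
    phi = math.acos(r) / 3.0
    largest = q + 2.0 * p * math.cos(phi)
    smallest = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    middle = 3.0 * q - largest - smallest  # the trace is invariant
    return [smallest, middle, largest]


# Inertia tensor of a uniform cube about a corner, in units of M b^2.
I_corner = [[2 / 3, -1 / 4, -1 / 4],
            [-1 / 4, 2 / 3, -1 / 4],
            [-1 / 4, -1 / 4, 2 / 3]]
moments = principal_moments(I_corner)
```

For this tensor the three values agree with the principal moments $\frac{1}{6}Mb^2$, $\frac{11}{12}Mb^2$, $\frac{11}{12}Mb^2$ found in Example 13.2. The eigenvectors still have to be obtained by substituting each root into (13.33).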


13.8 Parallel-axis theorem 


The values of the components of the inertia tensor depend on both the 
location and the orientation about which the body rotates relative to 
the body-fixed coordinate system. The parallel-axis theorem is valuable 
for relating the inertia tensor for rotation about parallel axes passing 
through different points fixed with respect to the rigid body. For ex- 
ample, one may wish to relate the inertia tensor through the center of 
mass to another location that is constrained to remain stationary, like 
the tip of the spinning top. 

Consider the mass $\alpha$ at the location $\mathbf{r} = (x_1, x_2, x_3)$ with respect to the origin of the center of mass body-fixed coordinate system O. Transform to an arbitrary but parallel body-fixed coordinate system Q, that is, the coordinate axes have the same orientation as the center of mass coordinate system. The location of the mass $\alpha$ with respect to this arbitrary coordinate system is $\mathbf{R} = (X_1, X_2, X_3)$. That is, the general vectors for the two coordinate systems are related by 

$$ \mathbf{R} = \mathbf{a} + \mathbf{r} \tag{13.34} $$

where $\mathbf{a}$ is the vector connecting the origins of the coordinate systems O and Q illustrated in figure 13.2. The elements of the inertia tensor with respect to axis system Q are given by equation 13.12 to be 

$$ J_{ij} \equiv \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} X_{\alpha,k}^2 - X_{\alpha,i} X_{\alpha,j} \right) \tag{13.35} $$

The components along the three axes for each of the two coordinate systems are related by 

$$ X_i = a_i + x_i \tag{13.36} $$

Figure 13.2: Transformation between two parallel body-coordinate systems, O and Q. 


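Substituting these into the above inertia tensor relation gives 

$$ \begin{aligned} J_{ij} &= \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} (x_{\alpha,k} + a_k)^2 - (x_{\alpha,i} + a_i)(x_{\alpha,j} + a_j) \right) \\ &= \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} x_{\alpha,k}^2 - x_{\alpha,i} x_{\alpha,j} \right) + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} \left( 2 x_{\alpha,k} a_k + a_k^2 \right) - \left( a_i x_{\alpha,j} + a_j x_{\alpha,i} + a_i a_j \right) \right) \end{aligned} \tag{13.37} $$

The first summation on the right-hand side corresponds to the elements $I_{ij}$ of the inertia tensor in the center-of-mass frame. Thus the terms can be regrouped to give 

$$ J_{ij} = I_{ij} + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} a_k^2 - a_i a_j \right) + \sum_{\alpha}^{N} m_\alpha \left( 2\delta_{ij} \sum_{k}^{3} x_{\alpha,k} a_k - a_i x_{\alpha,j} - a_j x_{\alpha,i} \right) \tag{13.38} $$

However, each term in the last bracket involves a sum of the form $\sum_\alpha m_\alpha x_{\alpha,k}$. Take the coordinate system O to be with respect to the center of mass for which 

$$ \sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha = 0 \tag{13.39} $$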
This also applies to each component $k$, that is 

$$ \sum_{\alpha}^{N} m_\alpha x_{\alpha,k} = 0 \tag{13.40} $$

Therefore all of the terms in the last bracket cancel leaving 

$$ J_{ij} = I_{ij} + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} a_k^2 - a_i a_j \right) \tag{13.41} $$

But $\sum_\alpha m_\alpha = M$ and $\sum_k a_k^2 = a^2$, thus 

$$ J_{ij} = I_{ij} + M \left( a^2 \delta_{ij} - a_i a_j \right) \tag{13.42} $$

where $I_{ij}$ is the center-of-mass inertia tensor. This is the general form of Steiner’s parallel-axis theorem. 
As an example, the moment of inertia around the $X_1$ axis is given by 

$$ J_{11} = I_{11} + M \left( (a_1^2 + a_2^2 + a_3^2)\,\delta_{11} - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) \tag{13.43} $$

which corresponds to the elementary statement that the difference in the moments of inertia equals the mass of the body multiplied by the square of the distance between the parallel axes, $x_1$ and $X_1$. Note that, for a given axis direction, the minimum moment of inertia of a body is $I_{ii}$, which is about the center of mass. 
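
The parallel-axis theorem (13.42) can be verified for a set of point masses by computing the inertia tensor about the displaced origin directly and comparing it with the shifted center-of-mass tensor. This is a sketch under stated assumptions: the masses and positions are arbitrary integers chosen so that the center of mass lies at the origin of O, and the function names are hypothetical.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj), following (13.12)."""
    tensor = [[0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += m * ((r2 if i == j else 0) - r[i] * r[j])
    return tensor


def parallel_axis(I_cm, M, a):
    """J_ij = I_ij + M (a^2 delta_ij - a_i a_j), following (13.42)."""
    a2 = sum(c * c for c in a)
    return [[I_cm[i][j] + M * ((a2 if i == j else 0) - a[i] * a[j])
             for j in range(3)] for i in range(3)]


# Positions relative to the center of mass O; sum of m * r is zero by design.
masses = [1, 1, 2]
r_cm = [(2, 0, 0), (0, 2, 2), (-1, -1, -1)]
a = (1, 2, 3)  # vector connecting the origins of Q and O

# Positions relative to Q follow from R = a + r, equation (13.34).
R = [tuple(ri + ai for ri, ai in zip(r, a)) for r in r_cm]

J_direct = inertia_tensor(masses, R)
J_theorem = parallel_axis(inertia_tensor(masses, r_cm), sum(masses), a)
```

The two tensors agree element by element. The agreement relies on O being the center of mass, which is why the test data were chosen to satisfy (13.39).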


13.1 Example: Inertia tensor of a solid cube rotating about the center of mass. 


The complicated expressions for the inertia tensor can be understood using the example of a uniform solid cube with side $b$, density $\rho$, and mass $M = \rho b^3$, rotating about different axes. Assume that the origin of the coordinate system O is at the center of mass with the axes perpendicular to the centers of the faces of the cube. 

The components of the inertia tensor can be calculated using (13.13) written as an integral over the mass distribution rather than a summation. 

$$ I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \sum_{k}^{3} x_k^2 - x_i x_j \right) dV $$

Thus 

$$ I_{11} = \rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_2^2 + x_3^2 \right) dx_3\, dx_2\, dx_1 = \frac{1}{6}\rho b^5 = \frac{1}{6} M b^2 = I_{22} = I_{33} $$

By symmetry the diagonal moments of inertia about each face are identical. Similarly the products of inertia are given by 

$$ I_{12} = -\rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_1 x_2 \right) dx_3\, dx_2\, dx_1 = 0 $$

Thus the inertia tensor is given by 

$$ \{\mathbf{I}\}^{cm} = \frac{1}{6} M b^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$

Inertia tensor of a uniform solid cube of side $b$ about the center of mass O and a corner of the cube Q. The vector $\mathbf{a}$ is the vector distance between O and Q. 


Note that this inertia tensor is diagonal, implying that this is a principal axis system. In this case all three principal moments of inertia are identical, and the chosen principal axes are perpendicular to the centers of the faces of the cube. This is as expected from the symmetry of the cubic geometry. 
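
The integrals in this example can be checked with a simple numerical quadrature. The sketch below approximates (13.13) for a uniform cube with a midpoint rule; the function name `cube_inertia_component`, the grid size, and the choice of unit side and unit density (so that $M = 1$) are assumptions made for illustration.

```python
def cube_inertia_component(i, j, lo, hi, n=20, rho=1.0):
    """Midpoint-rule estimate of I_ij for a uniform cube spanning [lo, hi]^3."""
    h = (hi - lo) / n
    pts = [lo + (k + 0.5) * h for k in range(n)]  # cell centers along one axis
    delta = 1.0 if i == j else 0.0
    total = 0.0
    for x1 in pts:
        for x2 in pts:
            for x3 in pts:
                x = (x1, x2, x3)
                r2 = x1 * x1 + x2 * x2 + x3 * x3
                total += rho * (delta * r2 - x[i] * x[j]) * h ** 3
    return total


# Cube of unit side centered on the origin, so M = rho * b^3 = 1.
I11_center = cube_inertia_component(0, 0, -0.5, 0.5)
I12_center = cube_inertia_component(0, 1, -0.5, 0.5)
```

The estimates approach $I_{11} = \frac{1}{6}Mb^2$ and $I_{12} = 0$ as the grid is refined. Changing the limits to the interval from 0 to $b$ gives the corresponding values about a corner of the cube.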


13.2 Example: Inertia tensor about a corner of a solid cube. 


a) Direct calculation Let one corner of the cube be the origin of the coordinate system Q and assume that the three adjacent sides of the cube lie along the coordinate axes. The components of the inertia tensor can be calculated using (13.13). Thus 

$$ I_{11} = \rho \int_{0}^{b} \int_{0}^{b} \int_{0}^{b} \left( x_2^2 + x_3^2 \right) dx_3\, dx_2\, dx_1 = \frac{2}{3}\rho b^5 = \frac{2}{3} M b^2 $$

$$ I_{12} = -\rho \int_{0}^{b} \int_{0}^{b} \int_{0}^{b} \left( x_1 x_2 \right) dx_3\, dx_2\, dx_1 = -\frac{1}{4}\rho b^5 = -\frac{1}{4} M b^2 $$

Thus, evaluating all the nine components gives 

$$ \{\mathbf{I}\}^{corner} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix} $$


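b) Parallel-axis theorem This inertia tensor also can be calculated using the parallel-axis theorem to relate the moment of inertia about the corner to that at the center of mass. As shown in the figure, the vector $\mathbf{a}$ has components 

$$ a_1 = a_2 = a_3 = \frac{b}{2} $$

Applying the parallel-axis theorem gives 

$$ J_{11} = I_{11} + M \left( a^2 - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) = \frac{1}{6} M b^2 + \frac{1}{2} M b^2 = \frac{2}{3} M b^2 $$

and similarly for $J_{22}$ and $J_{33}$. The off-diagonal terms are given by 

$$ J_{12} = I_{12} + M \left( -a_1 a_2 \right) = -\frac{1}{4} M b^2 $$

Thus the inertia tensor, translated from the center of mass to the corner of the cube, is 

$$ \{\mathbf{I}\}^{corner} = \begin{pmatrix} \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 \end{pmatrix} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix} $$

This inertia tensor about the corner of the cube is the same as that obtained by direct integration. 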


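c) Principal moments of inertia The coordinate axis frame used for rotation about the corner of the cube is not a principal axis frame. Therefore let us diagonalize the inertia tensor to find the principal axis frame and the principal moments of inertia about a corner. To achieve this requires solving the secular determinant 

$$ \begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0 $$

The value of a determinant is not affected by adding or subtracting any row or column from any other row or column. Subtracting row 1 from row 2 gives 

$$ \begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{11}{12} M b^2 + I & \left( \frac{11}{12} M b^2 - I \right) & 0 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0 $$

The determinant of this matrix is straightforward to evaluate and equals 

$$ \left( \frac{1}{6} M b^2 - I \right) \left( \frac{11}{12} M b^2 - I \right) \left( \frac{11}{12} M b^2 - I \right) = 0 $$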
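Thus the roots are 

$$ \{\mathbf{I}\}^{corner} = \begin{pmatrix} \frac{1}{6} M b^2 & 0 & 0 \\ 0 & \frac{11}{12} M b^2 & 0 \\ 0 & 0 & \frac{11}{12} M b^2 \end{pmatrix} $$

The identical roots $I_{22} = I_{33} = \frac{11}{12} M b^2$ imply that the principal axis associated with $I_{11}$ must be a symmetry axis. The orientation can be found by substituting $I_{11}$ into the above equation 

$$ \left( \{\mathbf{I}\} - I_{11}\{\mathbb{I}\} \right) \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \begin{pmatrix} 6 & -3 & -3 \\ -3 & 6 & -3 \\ -3 & -3 & 6 \end{pmatrix} \begin{pmatrix} \omega_{11} \\ \omega_{21} \\ \omega_{31} \end{pmatrix} = 0 $$

where the second subscript 1 attached to $\omega_{i1}$ signifies that this solution corresponds to $I_{11}$. This gives 

$$ \begin{aligned} 2\omega_{11} - \omega_{21} - \omega_{31} &= 0 \\ -\omega_{11} + 2\omega_{21} - \omega_{31} &= 0 \\ -\omega_{11} - \omega_{21} + 2\omega_{31} &= 0 \end{aligned} $$

Solving these three equations gives the unit vector for the first principal axis, for which $I_{11} = \frac{1}{6} M b^2$, to be 

$$ \hat{\mathbf{e}}_1 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} $$

This can be repeated to find the other two principal axes by substituting $I_{22} = \frac{11}{12} M b^2$. This gives for the second principal moment $I_{22}$ 

$$ \left( \{\mathbf{I}\} - I_{22}\{\mathbb{I}\} \right) \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \begin{pmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{pmatrix} \begin{pmatrix} \omega_{12} \\ \omega_{22} \\ \omega_{32} \end{pmatrix} = 0 $$

This results in three equations for the components of $\boldsymbol{\omega}$, but all three are the same, namely 

$$ \omega_{12} + \omega_{22} + \omega_{32} = 0 $$

This does not uniquely determine the direction of $\boldsymbol{\omega}$. However, it does imply that $\boldsymbol{\omega}_2$ corresponding to the second principal axis has the property that 

$$ \boldsymbol{\omega}_2 \cdot \hat{\mathbf{e}}_1 = 0 $$

that is, any direction of $\hat{\mathbf{e}}_2$ that is perpendicular to $\hat{\mathbf{e}}_1$ is acceptable. In other words, any two orthogonal unit vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ that are perpendicular to $\hat{\mathbf{e}}_1$ are acceptable. This ambiguity exists whenever two eigenvalues are equal; the three principal axes are only uniquely defined if all three eigenvalues are different. The same ambiguity exists when all three eigenvalues are identical, as occurs for the principal moments of inertia about the center of mass of a uniform solid cube. This explains why the principal moment of inertia for the diagonal of the cube, that passes through the center of mass, has the same moment as when the principal axes pass through the center of the faces of the cube. 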
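
The eigenvectors quoted in this example can be confirmed by direct multiplication. The sketch below works in units of $Mb^2$ and uses exact rational arithmetic; the two vectors chosen in the plane perpendicular to the cube diagonal are one arbitrary orthogonal pair among the infinitely many allowed, and the helper names are hypothetical.

```python
from fractions import Fraction

# Inertia tensor about the corner of the cube, in units of M b^2.
I_corner = [[Fraction(8, 12), Fraction(-3, 12), Fraction(-3, 12)],
            [Fraction(-3, 12), Fraction(8, 12), Fraction(-3, 12)],
            [Fraction(-3, 12), Fraction(-3, 12), Fraction(8, 12)]]


def matvec(A, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def is_eigenvector(A, v, eigenvalue):
    """True if A v equals eigenvalue * v exactly."""
    return matvec(A, v) == [eigenvalue * c for c in v]


# Unnormalized direction vectors: the cube diagonal and two vectors normal to it.
diagonal = [1, 1, 1]
in_plane_a = [1, -1, 0]
in_plane_b = [1, 1, -2]

checks = [
    is_eigenvector(I_corner, diagonal, Fraction(1, 6)),
    is_eigenvector(I_corner, in_plane_a, Fraction(11, 12)),
    is_eigenvector(I_corner, in_plane_b, Fraction(11, 12)),
]
```

All three checks succeed, and any linear combination of the two in-plane vectors is also an eigenvector with the degenerate eigenvalue, which is the ambiguity described above.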


13.9 Perpendicular-axis theorem for plane laminae 


Rigid-body rotation of thin plane laminae is encountered frequently. Examples of such laminae are a plane sheet of metal, a thin door, a bicycle wheel, a thin envelope or book. Deriving the inertia tensor for a plane lamina is relatively simple because there are limits on the possible relative magnitude of the principal moments of inertia. Consider that the principal axes are along the x, y, z coordinate axes. Then the sum of two principal moments of inertia about the center of mass is 

$$ I_x + I_y = \int \rho \left( y^2 + z^2 \right) dV + \int \rho \left( x^2 + z^2 \right) dV = \int \rho \left( x^2 + y^2 \right) dV + 2 \int \rho z^2\, dV \geq \int \rho \left( x^2 + y^2 \right) dV = I_z \tag{13.44} $$

Note that for any body the three principal moments of inertia must satisfy the triangle rule that the sum of any pair must exceed or equal the third. Moreover, if the body is a thin lamina with thickness $z = 0$, that is, a thin plate in the $x$-$y$ plane, then 

$$ I_x + I_y = I_z \tag{13.45} $$

This perpendicular-axis theorem can be very useful for solving problems involving rotation of plane laminae. 

The opposite of a plane lamina is a long thin cylindrical needle of mass $m$, length $L$, and radius $r$. Along the symmetry axis the principal moment is $I_z = \frac{1}{2} m r^2 \rightarrow 0$ as $r \rightarrow 0$, while perpendicular to the symmetry axis $I_x = I_y = \frac{1}{12} m L^2$. These satisfy the triangle rule. 


13.3 Example: Inertia tensor of a hula hoop 


The hula hoop is a thin plane circular ring of radius $R$ and mass $M$. Assume that the symmetry axis of
the circular ring is the 3 axis.

a) The principal moments of inertia about the center of mass: The principal moment of inertia along the
3 axis is $I_{33} = MR^2$. Then equation 13.45 plus symmetry tells us that the two principal moments of inertia
in the plane of the hula hoop must be $I_{11} = I_{22} = \frac{1}{2}MR^2$.

b) The principal moments of inertia about a point on the periphery of the ring: The parallel-axis theorem
gives the moment perpendicular to the plane of the hula hoop as $I_{33} = MR^2 + MR^2 = 2MR^2$. In the plane of the hoop
the moment about the axis tangential to the hoop is $I_{11} = \frac{1}{2}MR^2 + MR^2 = \frac{3}{2}MR^2$, and the moment about the
radial axis, which passes through the center of mass, is $I_{22} = \frac{1}{2}MR^2$. The
hula dancer often swings the hoop about the periphery and perpendicular to the plane by swinging their hips.
Another movement is jumping through the hoop by rotating the hoop about an axis tangential to the periphery. Calculation
of such maneuvers requires knowledge of these principal moments of inertia.
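
The hula-hoop moments quoted above can be checked numerically. The following is a minimal sketch, assuming the ring is approximated by equal point masses; the function name `ring_inertia_tensor`, the number of mass elements, and the unit values of $M$ and $R$ are illustrative choices, not part of the original example.

```python
import math

def ring_inertia_tensor(M, R, origin=(0.0, 0.0, 0.0), n=360):
    """Inertia tensor of a thin ring of mass M and radius R lying in the 1-2 plane,
    approximated by n equal point masses and evaluated about `origin`."""
    m = M / n
    I = [[0.0] * 3 for _ in range(3)]
    for k in range(n):
        a = 2.0 * math.pi * k / n
        # Position of the k-th mass element relative to the chosen origin.
        r = (R * math.cos(a) - origin[0], R * math.sin(a) - origin[1], -origin[2])
        r2 = sum(c * c for c in r)
        for i in range(3):
            for j in range(3):
                # Discrete form of I_ij = sum m (delta_ij r^2 - r_i r_j).
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I

M, R = 1.0, 1.0  # illustrative values
I_cm = ring_inertia_tensor(M, R)                          # about the center of mass
I_edge = ring_inertia_tensor(M, R, origin=(0.0, R, 0.0))  # about a point on the ring

print("center:   ", [round(I_cm[i][i], 6) for i in range(3)])
print("periphery:", [round(I_edge[i][i], 6) for i in range(3)])
```

With the origin placed at $(0, R, 0)$, axis 1 is tangential to the hoop and axis 2 is radial, matching the labelling used in part b). In both cases the diagonal elements satisfy the perpendicular-axis relation $I_{11} + I_{22} = I_{33}$.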


13.4 Example: Inertia tensor of a thin book 


Consider a thin rectangular book of mass $M$, width $a$ and length $b$ with thickness $t \ll a$ and $t \ll b$.
About the center of mass the moment of inertia perpendicular to the plane of the book is $I_{33} = \frac{M}{12}(a^2 + b^2)$. The
other two moments are $I_{11} = \frac{M}{12}a^2$ and $I_{22} = \frac{M}{12}b^2$, which satisfy equation 13.45.


13.10 General properties of the inertia tensor 


13.10.1 Inertial equivalence 


The elements of the inertia tensor, the values of the principal moments of inertia, and the orientation of the 
principal axes for a rigid body, all depend on the choice of origin for the system. Recall that for the kinetic 
energy to be separable into translational and rotational portions, the origin of the body coordinate system 
must coincide with the center of mass of the body. However, for any choice of the origin of any body, there 
always exists an orientation of the axes that diagonalizes the inertia tensor. 

The inertial properties of a body for rotation about a specific body-fixed location are defined completely
by only three principal moments of inertia irrespective of the detailed shape of the body. As a result, the 
inertial properties of any body about a body-fixed point are equivalent to that of an ellipsoid that has the 
same three principal moments of inertia. The symmetry properties of this equivalent ellipsoidal body define 
the symmetry of the inertial properties of the body. If a body has some simple symmetry then usually it is 
obvious as to what will be the principal axes of the body. 


Spherical top: $I_1 = I_2 = I_3$


A spherical top is a body having three degenerate principal moments of inertia. Such a body has the same 
symmetry as the inertia tensor about the center of a uniform sphere. For a sphere it is obvious from the 
symmetry that any orientation of three mutually orthogonal axes about the center of the uniform sphere are 
equally good principal axes. For a uniform cube the principal axes of the inertia tensor about the center of 
mass were shown to be aligned such that they pass through the center of each face, and the three principal 
moments are identical; that is, inertially it is equivalent to a spherical top. A less obvious consequence of the 
spherical symmetry is that any orientation of three mutually perpendicular axes about the center of mass of 
a uniform cube is an equally good principal axis system.


Symmetric top: $I_1 = I_2 \neq I_3$


The equivalent ellipsoid for a body with two degenerate principal moments of inertia is a spheroid which has
cylindrical symmetry with the cylindrical axis aligned along the third axis. A body with $I_3 < I_1 = I_2$ is a
prolate spheroid while a body with $I_3 > I_1 = I_2$ is an oblate spheroid. Examples with a prolate spheroidal
equivalent inertial shape are a rugby ball, pencil, or a baseball bat. Examples of an oblate spheroid are an
orange, or a frisbee. A uniform sphere, or a uniform cube, rotating about a point displaced from the
center-of-mass also behaves inertially like a symmetric top. The cylindrical symmetry of the equivalent spheroid
makes it obvious that any mutually perpendicular axes that are normal to the axis of cylindrical symmetry
are equally good principal axes even when the cross section in the 1-2 plane is square as opposed to circular.
A rotor is a diatomic-molecule shaped body which is a special case of a symmetric top where $I_1 = 0$
and $I_2 = I_3$. The rotation of a rotor is perpendicular to the symmetry axis since the rotational energy and
angular momentum about the symmetry axis are zero because the principal moment of inertia about the
symmetry axis is zero.


Asymmetric top: $I_1 \neq I_2 \neq I_3$


A body where all three principal moments of inertia are distinct, $I_1 \neq I_2 \neq I_3$, is called an asymmetric
top. Some molecules and nuclei have asymmetric, triaxially-deformed shapes.


13.10.2 Orthogonality of principal axes 


The body-fixed principal axes comprise an orthogonal set, for which the vectors $\mathbf{L}$ and $\boldsymbol{\omega}$ are simply related.
Components of $\mathbf{L}$ and $\boldsymbol{\omega}$ can be taken along the three body-fixed axes denoted by $i$. Thus for the $m^{th}$
principal moment $I_m$

$$L_{im} = I_m \omega_{im} \tag{13.46}$$

Written in terms of the inertia tensor

$$L_{im} = \sum_{k}^{3} I_{ik}\,\omega_{km} = I_m \omega_{im} \tag{13.47}$$

Similarly the $n^{th}$ principal moment can be written as

$$L_{kn} = \sum_{i}^{3} I_{ki}\,\omega_{in} = I_n \omega_{kn} \tag{13.48}$$

Multiplying equation 13.47 by $\omega_{in}$ and summing over $i$ gives

$$\sum_{i,k} I_{ik}\,\omega_{km}\omega_{in} = \sum_{i} I_m\,\omega_{im}\omega_{in} \tag{13.49}$$

Similarly multiplying equation 13.48 by $\omega_{km}$ and summing over $k$ gives

$$\sum_{i,k} I_{ki}\,\omega_{km}\omega_{in} = \sum_{k} I_n\,\omega_{km}\omega_{kn} \tag{13.50}$$

The left-hand sides of these equations are identical since the inertia tensor is symmetric, that is $I_{ik} = I_{ki}$.
Therefore subtracting these equations gives

$$\sum_{i} I_m\,\omega_{im}\omega_{in} - \sum_{k} I_n\,\omega_{km}\omega_{kn} = 0 \tag{13.51}$$

That is

$$(I_m - I_n)\sum_{k} \omega_{km}\omega_{kn} = 0 \tag{13.52}$$

or

$$(I_m - I_n)\,\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.53}$$

If $I_m \neq I_n$ then

$$\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.54}$$

which implies that the $m$ and $n$ principal axes are perpendicular. However, if $I_m = I_n$ then equation
13.53 does not require that $\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0$, that is, these axes are not necessarily perpendicular, but, with
no loss of generality, these two axes can be chosen to be perpendicular with any orientation in the plane
perpendicular to the symmetry axis.

Summarizing the above discussion, the inertia tensor has the following properties. 

1) Diagonalization may be accomplished by an appropriate rotation of the axes in the body. 

2) The principal moments (eigenvalues) and principal axes (eigenvectors) are obtained as roots of the 
secular determinant and are real. 

3) The principal axes (eigenvectors) are real and orthogonal. 

4) For a symmetric top with two identical principal moments of inertia, any orientation of two orthogonal
axes perpendicular to the symmetry axis gives satisfactory eigenvectors.

5) For a spherical top with three identical principal moments of inertia, the principal axis system can
have any orientation with respect to the origin.
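
The orthogonality argument of equations 13.46 to 13.54 can be restated compactly in matrix notation. The following is only a sketch of the same reasoning, assuming that $\boldsymbol{\omega}_m$ and $\boldsymbol{\omega}_n$ denote angular velocity vectors directed along the $m^{th}$ and $n^{th}$ principal axes and that the superscript $T$ denotes the transpose.

```latex
\begin{aligned}
\{\mathbf{I}\}\cdot\boldsymbol{\omega}_m &= I_m\,\boldsymbol{\omega}_m ,
\qquad
\{\mathbf{I}\}\cdot\boldsymbol{\omega}_n = I_n\,\boldsymbol{\omega}_n \\
\boldsymbol{\omega}_n^{T}\,\{\mathbf{I}\}\,\boldsymbol{\omega}_m &= I_m\,\boldsymbol{\omega}_n\cdot\boldsymbol{\omega}_m ,
\qquad
\boldsymbol{\omega}_m^{T}\,\{\mathbf{I}\}\,\boldsymbol{\omega}_n = I_n\,\boldsymbol{\omega}_m\cdot\boldsymbol{\omega}_n \\
\{\mathbf{I}\}^{T} = \{\mathbf{I}\}
\;&\Rightarrow\;
\boldsymbol{\omega}_n^{T}\,\{\mathbf{I}\}\,\boldsymbol{\omega}_m
= \boldsymbol{\omega}_m^{T}\,\{\mathbf{I}\}\,\boldsymbol{\omega}_n
\;\Rightarrow\;
(I_m - I_n)\,\boldsymbol{\omega}_m\cdot\boldsymbol{\omega}_n = 0
\end{aligned}
```

The symmetry of the inertia tensor is the only property used, which is why real symmetric tensors in general have orthogonal eigenvectors for distinct eigenvalues.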


13.11 Angular momentum L and angular velocity ω vectors


The angular momentum is a primary observable for rotation. As discussed in section 13.5, the angular
momentum $\mathbf{L}$ is compactly and elegantly written in matrix form using the tensor algebra relation

$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{13.55}$$

where $\boldsymbol{\omega}$ is the angular velocity, $\{\mathbf{I}\}$ the inertia tensor, and $\mathbf{L}$ the corresponding angular momentum.
Two important consequences of equation 13.55 are that:


• The angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ are not necessarily colinear.


• In general the principal axis system of the rotating rigid body is not aligned with either the angular
momentum or angular velocity vectors.


An exception to these statements occurs when the angular velocity $\boldsymbol{\omega}$ is aligned along a principal axis
for which the inertia tensor is diagonal, i.e. $I_{ij} = I_{ii}\delta_{ij}$, and then both $\mathbf{L}$ and $\boldsymbol{\omega}$ point along this principal
axis. In general the angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ precess around each other. An important
special case is for torque-free systems where Noether's theorem implies that the angular momentum vector
$\mathbf{L}$ is conserved both in magnitude and direction. In this case, the angular velocity $\boldsymbol{\omega}$, and the principal axis
system, both precess around the angular momentum vector $\mathbf{L}$. That is, the body appears to tumble with
respect to the laboratory-fixed frame. Understanding rigid-body rotation requires care not to confuse the
body-fixed principal axis coordinate frame, used to determine the inertia tensor, and the fixed laboratory
frame where the motion is observed.


13.5 Example: Rotation about the center of mass of a solid cube 


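It is illustrative to use the inertia tensor of a uniform cube to compute the angular momentum for any
applied angular velocity vector $\boldsymbol{\omega}$ using equation (13.55). If the angular velocity is along the $x$ axis, then
using the inertia tensor for a solid cube of side $b$ about its center of mass, derived earlier, in equation (13.55) gives the angular momentum to be

$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} = \frac{1}{6}Mb^2\omega \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{6}Mb^2\omega \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$

This shows that $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear and thus the $x$ axis is a principal axis. By symmetry, the $y$ and $z$
body-fixed axes also must be principal axes.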




Consider that the body is rotated about a diagonal of the cube for which the center of mass will be on
the rotation axis. Then the angular velocity vector is written as $\boldsymbol{\omega} = \frac{\omega}{\sqrt{3}}(1, 1, 1)^T$ where the components
$\omega_x = \omega_y = \omega_z = \frac{\omega}{\sqrt{3}}$ with the angular velocity magnitude $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$.

$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} = \frac{1}{6}Mb^2\frac{\omega}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6}Mb^2\frac{\omega}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6}Mb^2\boldsymbol{\omega}$$

Note that $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear showing it also is a principal axis. Moreover, the magnitude of $\mathbf{L}$
is identical for orientations of the rotation axis $\boldsymbol{\omega}$ passing through the center of mass when centered on
either one face, or the diagonal, of the cube implying that the principal moments of inertia about these axes
are identical. This illustrates the important property that, when the three principal moments of inertia are
identical, then any orientation of the coordinate system is an equally good principal axis system. That is,
this corresponds to the spherical top where all orientations are principal axes, not just along the obvious
symmetry axes.


13.6 Example: Rotation about the corner of the cube 


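Let us repeat the above exercise for rotation about one corner of the cube. Consider that the angular
velocity is along the $x$ axis. Then the inertia tensor from example 13.2 gives the angular momentum to be

$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} = \frac{1}{12}Mb^2\omega \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{12}Mb^2\omega \begin{pmatrix} +8 \\ -3 \\ -3 \end{pmatrix}$$

The angular momentum is far from being aligned with the $x$ axis of $\boldsymbol{\omega}$, that is, it is not a principal axis.
Consider that the body is rotated with the angular velocity aligned along a diagonal of the cube through
the center of mass. Then the angular velocity is written as $\boldsymbol{\omega} = \frac{\omega}{\sqrt{3}}(1, 1, 1)^T$ where the components
$\omega_x = \omega_y = \omega_z = \frac{\omega}{\sqrt{3}}$ ensure that the magnitude equals $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$.

$$\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} = \frac{1}{12}Mb^2\frac{\omega}{\sqrt{3}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{12}Mb^2\frac{\omega}{\sqrt{3}} \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix} = \frac{1}{6}Mb^2\boldsymbol{\omega}$$

This is a principal axis since $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear and the angular momentum is the same as for any
axis through the center of mass of a uniform solid cube due to the high symmetry of the cube. If the angular
velocity is perpendicular to the diagonal of the cube, then, for either of these perpendicular axes, the relation
between $\mathbf{L}$ and $\boldsymbol{\omega}$ is given by

$$\mathbf{L} = \frac{1}{12}Mb^2\frac{\omega}{\sqrt{2}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix}\cdot\begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix} = \frac{1}{12}Mb^2\frac{\omega}{\sqrt{2}} \begin{pmatrix} -11 \\ +11 \\ 0 \end{pmatrix} = \frac{11}{12}Mb^2\frac{\omega}{\sqrt{2}} \begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix}$$

Note that this must be a principal axis for rotation about a corner of the cube since $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear.
The angular momentum is the same for both possible orientations of $\boldsymbol{\omega}$ that are perpendicular to the diagonal
through the center of mass. Diagonalizing the inertia tensor in example 13.2 also gave the above result with
the symmetry axis along the diagonal of the cube.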

This example illustrates that it is not necessary to diagonalize the inertia tensor matrix to obtain the 
principal axes. The corner of the cube has three mutually perpendicular principal axes independent of the 
choice of a body-fixed coordinate frame. The advantage of the principal axis coordinate frame is that the 
inertia tensor is diagonal making evaluation of the angular momentum trivial. That is, there is no physics 
associated with the orientation chosen for the body-fixed coordinate frame, this frame only determines the 
ratio of the components of the inertia tensor along the chosen coordinates. Note that, if a body has an obvious 
symmetry, then intuition is a powerful way to identify the principal axis frame. 
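
The collinearity test used in this example is easy to automate. The sketch below assumes the corner inertia tensor $\frac{1}{12}Mb^2$ times the matrix with diagonal elements $+8$ and off-diagonal elements $-3$ quoted in the example, with illustrative values $M = b = 1$. The helper names (`matvec`, `cross`, `dot`, `analyse`) are hypothetical. For each trial axis it reports the effective moment $\mathbf{L}\cdot\boldsymbol{\omega}/\omega^2$ and the magnitude of $\mathbf{L}\times\boldsymbol{\omega}$, which vanishes only for a principal axis.

```python
import math

def matvec(A, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

M, b = 1.0, 1.0  # illustrative values
scale = M * b * b / 12.0
# Inertia tensor about one corner of a uniform cube, as quoted in the example.
I_corner = [[scale * x for x in row]
            for row in ([8, -3, -3], [-3, 8, -3], [-3, -3, 8])]

def analyse(w):
    """Return (L.w / w^2, |L x w|) for a unit-magnitude angular velocity w."""
    L = matvec(I_corner, w)
    Lxw = cross(L, w)
    return dot(L, w) / dot(w, w), math.sqrt(dot(Lxw, Lxw))

axes = {
    "x axis": [1.0, 0.0, 0.0],
    "body diagonal": [1 / math.sqrt(3)] * 3,
    "perpendicular to diagonal": [-1 / math.sqrt(2), 1 / math.sqrt(2), 0.0],
}
results = {name: analyse(w) for name, w in axes.items()}
for name, (moment, misalignment) in results.items():
    print(f"{name:26s} L.w/w^2 = {moment:.4f}   |L x w| = {misalignment:.4f}")
```

The $x$ axis gives a non-zero $|\mathbf{L}\times\boldsymbol{\omega}|$, while the body diagonal and the axis perpendicular to it give zero, with effective moments $\frac{1}{6}Mb^2$ and $\frac{11}{12}Mb^2$ respectively.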


13.12 Kinetic energy of rotating rigid body


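An important observable is the kinetic energy of rotation of a rigid body. Consider a rigid body composed
of $N$ particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \dots, N$. If the body rotates with an instantaneous angular velocity
$\boldsymbol{\omega}$ about some fixed point, with respect to the body coordinate system, and this point has an instantaneous
translational velocity $\mathbf{V}$ with respect to the fixed (inertial) coordinate system, see figure 13.1, then the
instantaneous velocity $\mathbf{v}_\alpha$ of the $\alpha^{th}$ particle in the fixed frame of reference is given by

$$\mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega}\times\mathbf{r}'_\alpha \tag{13.56}$$

However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is $\mathbf{v}''_\alpha = 0$,
thus

$$\mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}'_\alpha \tag{13.57}$$

The total kinetic energy is given by

$$\begin{aligned} T &= \sum_{\alpha}^{N} \frac{1}{2} m_\alpha \mathbf{v}_\alpha\cdot\mathbf{v}_\alpha = \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left(\mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}'_\alpha\right)\cdot\left(\mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}'_\alpha\right) \\ &= \frac{1}{2}\sum_{\alpha}^{N} m_\alpha V^2 + \sum_{\alpha}^{N} m_\alpha \mathbf{V}\cdot\boldsymbol{\omega}\times\mathbf{r}'_\alpha + \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right)\cdot\left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right) \end{aligned} \tag{13.58}$$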


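This is a general expression for the kinetic energy that is valid for any choice of the origin from which the
body-fixed vectors $\mathbf{r}'_\alpha$ are measured. However, if the origin is chosen to be the center of mass, then, and only
then, the middle term cancels. That is, since $\mathbf{V}$ and $\boldsymbol{\omega}$ are independent of the specific particle, then

$$\sum_{\alpha}^{N} m_\alpha \mathbf{V}\cdot\boldsymbol{\omega}\times\mathbf{r}'_\alpha = \mathbf{V}\cdot\boldsymbol{\omega}\times\left(\sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha\right) \tag{13.59}$$

But the definition of the center of mass is

$$\sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha = M\mathbf{R} \tag{13.60}$$

and $\mathbf{R} = 0$ in the body-fixed frame if the selected point in the body is the center of mass. Thus, when using
the center of mass frame, the middle term of equation 13.58 is zero. Therefore, for the center of mass frame,
the kinetic energy separates into two terms in the body-fixed frame

$$T = T_{trans} + T_{rot} \tag{13.61}$$

where

$$T_{trans} = \frac{1}{2}\sum_{\alpha}^{N} m_\alpha V^2 \qquad\qquad T_{rot} = \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right)\cdot\left(\boldsymbol{\omega}\times\mathbf{r}'_\alpha\right) \tag{13.62}$$

The vector identity

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2B^2 - (\mathbf{A}\cdot\mathbf{B})^2 \tag{13.63}$$

can be used to simplify $T_{rot}$

$$T_{rot} = \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left[\omega^2 r'^2_\alpha - \left(\boldsymbol{\omega}\cdot\mathbf{r}'_\alpha\right)^2\right] \tag{13.64}$$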


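The rotational kinetic energy $T_{rot}$ can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ in the body-fixed
frame. Also the following formulae are greatly simplified if $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ in the rotating body-fixed frame
is written in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers 1, 2, 3 rather than
$x, y, z$. In this notation the rotational kinetic energy is written as

$$T_{rot} = \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left[\left(\sum_i \omega_i^2\right)\left(\sum_k x_{\alpha,k}^2\right) - \left(\sum_i \omega_i x_{\alpha,i}\right)\left(\sum_j \omega_j x_{\alpha,j}\right)\right] \tag{13.65}$$

Assume the Kronecker delta relation

$$\omega_i = \sum_j \omega_j \delta_{ij} \tag{13.66}$$

Then the kinetic energy can be written more compactly

$$\begin{aligned} T_{rot} &= \frac{1}{2}\sum_{\alpha}^{N} m_\alpha \left[\left(\sum_i \omega_i^2\right)\left(\sum_k x_{\alpha,k}^2\right) - \left(\sum_i \omega_i x_{\alpha,i}\right)\left(\sum_j \omega_j x_{\alpha,j}\right)\right] \\ &= \frac{1}{2}\sum_{\alpha}^{N}\sum_{i,j}^{3} m_\alpha \left[\omega_i\omega_j\delta_{ij}\left(\sum_k^3 x_{\alpha,k}^2\right) - \omega_i\omega_j x_{\alpha,i}x_{\alpha,j}\right] \\ &= \frac{1}{2}\sum_{i,j}^{3} \omega_i\omega_j \left[\sum_{\alpha}^{N} m_\alpha \left(\delta_{ij}\sum_k^3 x_{\alpha,k}^2 - x_{\alpha,i}x_{\alpha,j}\right)\right] \end{aligned} \tag{13.67}$$

The term in the outer square brackets is the inertia tensor defined in equation 13.12 for a discrete body. The
inertia tensor components for a continuous body are given by equation 13.13.
Thus the rotational component of the kinetic energy can be written in terms of the inertia tensor as

$$T_{rot} = \frac{1}{2}\sum_{i,j}^{3} I_{ij}\,\omega_i\omega_j \tag{13.68}$$

Note that when the inertia tensor is diagonal, the evaluation of the kinetic energy simplifies to

$$T_{rot} = \frac{1}{2}\sum_{i}^{3} I_{ii}\,\omega_i^2 \tag{13.69}$$

which is the familiar relation in terms of the scalar moment of inertia $I$ discussed in elementary mechanics.
Equation 13.68 also can be factored in terms of the angular momentum $\mathbf{L}$.

$$T_{rot} = \frac{1}{2}\sum_{i,j} I_{ij}\,\omega_i\omega_j = \frac{1}{2}\sum_{i} \omega_i \sum_{j} I_{ij}\,\omega_j = \frac{1}{2}\sum_{i} \omega_i L_i \tag{13.70}$$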
i,j i j i 


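As mentioned earlier, tensor algebra is an elegant and compact way of expressing such matrix operations.
Thus it is possible to express the rotational kinetic energy as

$$T_{rot} = \frac{1}{2}\begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix}\cdot\begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}\cdot\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{13.71}$$

$$T_{rot} \equiv T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{13.72}$$

where the rotational energy $T$ is a scalar. Using equation 13.55 the rotational component of the kinetic
energy also can be written as

$$T_{rot} \equiv T = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} \tag{13.73}$$

which is the same as given by (13.70). It is interesting to realize that even though $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$ is the inner
product of a tensor and a vector, it is a vector as illustrated by the fact that the inner product $T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L} =
\frac{1}{2}\boldsymbol{\omega}\cdot(\{\mathbf{I}\}\cdot\boldsymbol{\omega})$ is a scalar. Note that the translational kinetic energy $T_{trans}$ must be added to the rotational
kinetic energy $T_{rot}$ to get the total kinetic energy as given by equation 13.61.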
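
The equivalence of the three forms of the rotational kinetic energy can be checked numerically. The following minimal sketch assumes an arbitrary set of four point masses and an arbitrary angular velocity; all numerical values and helper names are illustrative. It evaluates $T_{rot}$ directly from $\frac{1}{2}\sum_\alpha m_\alpha |\boldsymbol{\omega}\times\mathbf{r}'_\alpha|^2$, from the inertia tensor as in equation 13.68, and from $\frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}$ as in equation 13.73.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_tensor(masses, positions):
    """Discrete inertia tensor I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - r[i] * r[j])
    return I

# Arbitrary illustrative body: positions are measured from the body-fixed origin.
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 3.0), (2.0, -1.0, -2.0)]
w = (0.3, -1.2, 0.7)

# Direct sum over particles of (1/2) m |w x r|^2.
T_direct = 0.5 * sum(m * dot(cross(w, r), cross(w, r))
                     for m, r in zip(masses, positions))

# Quadratic form (1/2) sum_ij I_ij w_i w_j.
I = inertia_tensor(masses, positions)
T_tensor = 0.5 * sum(I[i][j] * w[i] * w[j] for i in range(3) for j in range(3))

# (1/2) w . L with L = I . w.
L = [sum(I[i][j] * w[j] for j in range(3)) for i in range(3)]
T_from_L = 0.5 * dot(w, L)

print(T_direct, T_tensor, T_from_L)
```

The three values agree to rounding error for any choice of origin, since the step from equation 13.62 to 13.68 is purely algebraic. Only the separation of the total kinetic energy into $T_{trans} + T_{rot}$ requires the origin to be the center of mass.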


13.13 Euler angles


The description of rigid-body rotation is greatly facilitated by transforming from the space-fixed coordinate
frame $(\hat{x}, \hat{y}, \hat{z})$ to a rotating body-fixed coordinate frame $(\hat{1}, \hat{2}, \hat{3})$ for which the inertia tensor is diagonal.
Appendix D introduced the rotation matrix $\{A\}$ which can be used to rotate between the space-fixed coordinate
system, which is stationary, and the instantaneous body-fixed frame which is rotating with respect to the
space-fixed frame. The transformation can be represented by a matrix equation

$$(\hat{1}, \hat{2}, \hat{3}) = \{A\}\cdot(\hat{x}, \hat{y}, \hat{z}) \tag{13.74}$$

where the space-fixed system is identified by unit vectors $(\hat{x}, \hat{y}, \hat{z})$ while $(\hat{1}, \hat{2}, \hat{3})$ defines unit vectors in the rotated
body-fixed system. The rotation matrix $\{A\}$ completely describes the instantaneous relative orientation of the
two systems. Rigid-body rotation requires three independent angular parameters that specify the orientation
of the rigid body such that the corresponding orthogonal transformation matrix is proper, that is, it has a
determinant $|A| = +1$ as given by equation (D.33).

As discussed in Appendix D.2, the 9-component rotation matrix involves only three independent angles.
There are many possible choices for these three angles. It is convenient to use the Euler angles, $\phi, \theta, \psi$, (also
called Eulerian angles) shown in figure 13.3.1 The Euler angles are generated by a series of three rotations that
rotate from the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ system to the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ system. The rotation must be such that
the space-fixed $z$ axis rotates by an angle $\theta$ to align with the body-fixed 3 axis. This can be performed by rotating
through an angle $\theta$ about the $\hat{n} \equiv \hat{z}\times\hat{3}$ direction, where $\hat{z}$ and $\hat{3}$ designate the unit vectors along the "z" axes
of the space-fixed and body-fixed frames respectively. The unit vector $\hat{n} \equiv \hat{z}\times\hat{3}$ is the vector normal to the plane
defined by the $\hat{z}$ and $\hat{3}$ unit vectors and this unit vector $\hat{n}$ is called the line of nodes. The chosen
convention is that the unit vector $\hat{n} \equiv \hat{z}\times\hat{3}$ is along the "x" axis of an intermediate-axis frame designated
by $(\hat{n}, \hat{y}', \hat{z})$, that is, the unit vectors $\hat{y}'$ and $\hat{z}$ lie in the same plane as the $\hat{z}$ and $\hat{3}$ unit vectors,
which is the plane normal to $\hat{n}$. The sequence of three rotations is performed as summarized below.

Figure 13.3: The z-x-z sequence of rotations $A_\phi, A_\theta, A_\psi$ corresponding to the Eulerian angles
$(\phi, \theta, \psi)$. The first rotation $\phi$ about the space-fixed z axis (blue) is from the x-axis (blue) to the
line of nodes n (green). The second rotation $\theta$ about the line of nodes (green) is from the space-fixed
z axis (blue) to the body-fixed 3-axis (red). The third rotation $\psi$ about the body-fixed 3-axis
(red) is from the line of nodes (green) to the body-fixed 1 axis (red).


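1) Rotation $\phi$ about the space-fixed $\hat{z}$ axis from the space $\hat{x}$ axis to the line of nodes $\hat{n}$: The
first rotation $(\hat{x}, \hat{y}, \hat{z})\cdot A_\phi \rightarrow (\hat{n}, \hat{y}', \hat{z})$ is in a right-handed direction through an angle $\phi$ about the space-fixed
$z$ axis. Since the rotation takes place in the $x$-$y$ plane, the transformation matrix is

$$\{A_\phi\} = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.75}$$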

1 The space-fixed coordinate frame and the body-fixed coordinate frame are unambiguously defined, that is, the space-fixed
frame is stationary while the body-fixed frame is the principal-axis frame of the body. There are several possible intermediate
frames that can be used to define the Euler angles. The z-x-z sequence of rotations, used here, is used in most physics
textbooks in classical mechanics. Unfortunately scientists and engineers use slightly different conventions for defining the Euler
angles. As discussed in Appendix A of "Classical Mechanics" by Goldstein, nuclear and particle physicists have adopted the
z-y-z sequence of rotations while the US and UK aerodynamicists have adopted an x-y-z sequence of rotations.

This leads to the intermediate coordinate system $(\hat{n}, \hat{y}', \hat{z})$ where the rotated $x$ axis now is colinear with the
$\hat{n}$ axis of the intermediate frame, that is, the line of nodes.

$$(\hat{n}, \hat{y}', \hat{z}) = \{A_\phi\}\cdot(\hat{x}, \hat{y}, \hat{z}) \tag{13.76}$$

The precession angular velocity $\dot{\phi}$ is the rate of change of angle of the line of nodes with respect to the space
$x$ axis about the space-fixed $z$ axis.


2) Rotation $\theta$ about the line of nodes $\hat{n}$ from the space $\hat{z}$ axis to the body-fixed $\hat{3}$ axis: The
second rotation

$$(\hat{n}, \hat{y}', \hat{z})\cdot A_\theta \rightarrow (\hat{n}, \hat{y}'', \hat{3}) \tag{13.77}$$

is in a right-handed direction through the angle $\theta$ about the $\hat{n}$ axis (line of nodes) so that the "z" axis becomes
colinear with the body-fixed $\hat{3}$ axis. Because the rotation now is in the $\hat{z}$-$\hat{3}$ plane, the transformation matrix
is

$$\{A_\theta\} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{13.78}$$

The line of nodes which is at the intersection of the space-fixed and body-fixed planes, shown in figure 13.3,
points in the $\hat{n} \equiv \hat{z}\times\hat{3}$ direction. The new "z" axis now is the body-fixed $\hat{3}$ axis. The angular velocity $\dot{\theta}$ is
the rate of change of angle of the body-fixed 3-axis relative to the space-fixed z-axis about the line of nodes.


3) Rotation $\psi$ about the body-fixed $\hat{3}$ axis from the line of nodes to the body-fixed $\hat{1}$ axis: The
third rotation

$$(\hat{n}, \hat{y}'', \hat{3})\cdot A_\psi \rightarrow (\hat{1}, \hat{2}, \hat{3}) \tag{13.79}$$

is in a right-handed direction through the angle $\psi$ about the new body-fixed $\hat{3}$ axis. This third rotation
transforms the rotated intermediate $(\hat{n}, \hat{y}'', \hat{3})$ frame to the final body-fixed coordinate system $(\hat{1}, \hat{2}, \hat{3})$. The
transformation matrix is

$$\{A_\psi\} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.80}$$

The spin angular velocity $\dot{\psi}$ is the rate of change of the angle of the body-fixed 1-axis with respect to the
line of nodes about the body-fixed 3 axis.
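The total rotation matrix $\{A\}$ is given by

$$\{A\} = \{A_\psi\}\cdot\{A_\theta\}\cdot\{A_\phi\} \tag{13.81}$$

Thus the complete rotation from the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ axis system to the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ axis system
is given by

$$(\hat{1}, \hat{2}, \hat{3}) = \{A\}\cdot(\hat{x}, \hat{y}, \hat{z}) \tag{13.82}$$

where $\{A\}$ is given by the triple product equation (13.81) leading to the rotation matrix

$$\{A\} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & \sin\theta\sin\psi \\ -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & \sin\theta\cos\psi \\ \sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta \end{pmatrix} \tag{13.83}$$

The inverse transformation from the body-fixed axis system to the space-fixed axis system is given by

$$(\hat{x}, \hat{y}, \hat{z}) = \{A\}^{-1}\cdot(\hat{1}, \hat{2}, \hat{3}) \tag{13.84}$$

where the inverse matrix $\{A\}^{-1}$ equals the transposed rotation matrix $\{A\}^T$, that is,

$$\{A\}^{-1} = \{A\}^T = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & \sin\phi\sin\theta \\ \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & -\cos\phi\sin\theta \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix} \tag{13.85}$$

Since $\{A\}^{-1} = \{A\}^T$, the product $\{A\}\cdot\{A\}^T = 1$ is the unit matrix, which shows that the rotation matrix is
orthogonal; its determinant $|A| = +1$ makes it a proper rotation.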

The use of three different coordinate systems, space-fixed, the intermediate line of nodes, and the body-fixed
frame can be confusing at first glance. Basically the angle $\phi$ specifies the rotation about the space-fixed
$z$ axis between the space-fixed $x$ axis and the line of nodes of the Euler angle intermediate frame. The angle
$\theta$ specifies the rotation about the line of nodes between the space-fixed $z$ axis and the body-fixed 3 axis. The angle
$\psi$ specifies the rotation about the body-fixed 3 axis between the line of nodes and the body-fixed 1 axis. Note
that although the space-fixed and body-fixed axes systems each are orthogonal, the Euler angle basis in
general is not orthogonal. For rigid-body rotation the rotation angle $\phi$ about the space-fixed $z$ axis is time
dependent, that is, the line of nodes is rotating with an angular velocity $\dot{\phi}$ with respect to the space-fixed
coordinate frame. Similarly the body-fixed coordinate frame is rotating about the body-fixed 3 axis with
angular velocity $\dot{\psi}$ relative to the line of nodes.


13.7 Example: Euler angle transformation 


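The definition of the Euler angles can be confusing, therefore it is useful to illustrate their use for a
rotational transformation of a primed frame $(x', y', z')$ to an unprimed frame $(x, y, z)$. Assume the first
rotation, about the $z'$ axis, is $\phi = 30°$

$$A_\phi = \begin{pmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Let the second rotation be $\theta = 45°$ about the line of nodes, that is, the intermediate $x''$ axis. Then

$$A_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}$$

Let the third rotation be $\psi = 90°$ about the $z$ axis.

$$A_\psi = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Thus the net rotation corresponds to $A = A_\psi A_\theta A_\phi$

$$A = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{4}\sqrt{2} & \frac{1}{4}\sqrt{6} & \frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{3} & -\frac{1}{2} & 0 \\ \frac{1}{4}\sqrt{2} & -\frac{1}{4}\sqrt{6} & \frac{1}{2}\sqrt{2} \end{pmatrix}$$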
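
The triple product in this example can be reproduced with a few lines of code. The sketch below assumes the z-x-z convention of equations 13.75, 13.78 and 13.80; the function names (`rot_z`, `rot_x`, `euler_matrix`, `euler_closed_form`) are illustrative. It builds $A = A_\psi A_\theta A_\phi$ for $\phi = 30°$, $\theta = 45°$, $\psi = 90°$, compares the product with the closed form of equation 13.83, and checks that $A$ is a proper orthogonal matrix.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def rot_z(a):
    """Rotation about the current z axis (form of equations 13.75 and 13.80)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(a):
    """Rotation about the current x axis, the line of nodes (equation 13.78)."""
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def euler_matrix(phi, theta, psi):
    """A = A_psi . A_theta . A_phi (equation 13.81)."""
    return matmul(rot_z(psi), matmul(rot_x(theta), rot_z(phi)))

def euler_closed_form(phi, theta, psi):
    """Closed form of the same matrix (equation 13.83)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
            [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
            [sf * st, -cf * st, ct]]

angles = (math.radians(30), math.radians(45), math.radians(90))
A = euler_matrix(*angles)
A_closed = euler_closed_form(*angles)
AAT = matmul(A, transpose(A))

for row in A:
    print(["%8.4f" % x for x in row])
print("det A =", round(det3(A), 12))
```

Building the matrix from the three elementary rotations, rather than typing in equation 13.83 directly, makes the order of the rotations explicit; reversing the order of the factors gives a different matrix.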


13.14 Angular velocity ω


It is useful to relate the rigid-body equations of motion in the space-fixed $(\hat{x}, \hat{y}, \hat{z})$ coordinate system to
those in the body-fixed $(\hat{e}_1, \hat{e}_2, \hat{e}_3)$ coordinate system where the principal axis inertia tensor is defined. It
was shown in appendix D that an infinitesimal rotation can be represented by a vector. Thus the time
derivatives of these rotation angles can be associated with the components of the angular velocity $\boldsymbol{\omega}$, where
the precession $\omega_\phi = \dot{\phi}$, the nutation $\omega_\theta = \dot{\theta}$, and the spin $\omega_\psi = \dot{\psi}$. Unfortunately the coordinates $(\phi, \theta, \psi)$
are with respect to mixed coordinate frames and thus are not orthogonal axes. That is, the Euler angular
velocities are expressed in different coordinate frames, where the precession $\dot{\phi}$ is around the space-fixed $\hat{z}$
axis measured relative to the $\hat{x}$-axis, the spin $\dot{\psi}$ is around the body-fixed $\hat{e}_3$ axis relative to the rotating
line-of-nodes, and the nutation $\dot{\theta}$ is the angular velocity between the $\hat{z}$ and $\hat{e}_3$ axes and points along the
instantaneous line-of-nodes in the $\hat{z}\times\hat{e}_3$ direction. By reference to figure 13.3 it can be seen that the
components along the body-fixed axes are as given in Table 13.1.


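Table 13.1: Euler angular velocity components in the body-fixed frame

Precession $\dot{\phi}$ | Nutation $\dot{\theta}$ | Spin $\dot{\psi}$
$\dot{\phi}_1 = \dot{\phi}\sin\theta\sin\psi$ | $\dot{\theta}_1 = \dot{\theta}\cos\psi$ | $\dot{\psi}_1 = 0$
$\dot{\phi}_2 = \dot{\phi}\sin\theta\cos\psi$ | $\dot{\theta}_2 = -\dot{\theta}\sin\psi$ | $\dot{\psi}_2 = 0$
$\dot{\phi}_3 = \dot{\phi}\cos\theta$ | $\dot{\theta}_3 = 0$ | $\dot{\psi}_3 = \dot{\psi}$

Note that the precession angular velocity $\dot{\phi}$ is the angular velocity that the body-fixed $\hat{e}_3$ and $\hat{z}\times\hat{e}_3$ axes
precess around the space-fixed $\hat{z}$ axis. Table 13.1 gives the Euler angular velocities required to calculate
the components of the angular velocity $\boldsymbol{\omega}$ for the body-fixed $(\hat{1}, \hat{2}, \hat{3})$ axis system. Collecting the individual
components of $\boldsymbol{\omega}$ gives the components of the angular velocity of the body, relative to the space-fixed axes,
in the body-fixed axis system $(\hat{1}, \hat{2}, \hat{3})$

$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \tag{13.86}$$

$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \tag{13.87}$$

$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi}\cos\theta + \dot{\psi} \tag{13.88}$$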


The angular velocity of the body about the body-fixed 3-axis, $\omega_3$, is the sum of the projection onto the 3-axis of the
precession angular velocity $\dot{\phi}$ of the line-of-nodes about the space-fixed $z$-axis, plus the spin angular
velocity $\dot{\psi}$ of the body about the body-fixed 3-axis with respect to the rotating line-of-nodes.

Similarly, the components of the body angular velocity $\boldsymbol{\omega}$ for the space-fixed axis system $(\hat{x}, \hat{y}, \hat{z})$ can be
derived to be

$$\omega_x = \dot{\theta}\cos\phi + \dot{\psi}\sin\theta\sin\phi \tag{13.89}$$

$$\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi \tag{13.90}$$

$$\omega_z = \dot{\phi} + \dot{\psi}\cos\theta \tag{13.91}$$

Note that when $\theta = 0$ then the Euler angles are singular in that the space-fixed $z$ axis is parallel with
the body-fixed 3 axis and there is no way of distinguishing between precession $\dot{\phi}$ and spin $\dot{\psi}$, leading to
$\omega_z = \omega_3 = \dot{\phi} + \dot{\psi}$. When $\theta = \pi$ then the $z$ axis and 3 axis are antiparallel and $\omega_z = \dot{\phi} - \dot{\psi} = -\omega_3$. The other
special case is when $\cos\theta = 0$ for which the Euler angle system is orthogonal and the space-fixed $\omega_z = \dot{\phi}$,
that is, it equals the precession, while the body-fixed $\omega_3 = \dot{\psi}$, that is, it equals the spin. When the Euler
angle basis is not orthogonal then equations (13.86-88) and (13.89-91) are needed for expressing the
Euler equations of motion in either the body-fixed frame or the space-fixed frame respectively.

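Equations 13.86-88 for the components of the angular velocity in the body-fixed frame can be expressed
in terms of the Euler angle velocities in a matrix form as

$$\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \begin{pmatrix} \sin\theta\sin\psi & \cos\psi & 0 \\ \sin\theta\cos\psi & -\sin\psi & 0 \\ \cos\theta & 0 & 1 \end{pmatrix}\cdot\begin{pmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{pmatrix} \tag{13.92}$$

Again note that the transformation matrix is not orthogonal, which is to be expected since the Euler angular
velocities are about axes that do not form a rectangular system of coordinates. Similarly equations 13.89-91
for the angular velocity in the space-fixed frame can be expressed in terms of the Euler angle velocities in
matrix form as

$$\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix} = \begin{pmatrix} 0 & \cos\phi & \sin\theta\sin\phi \\ 0 & \sin\phi & -\sin\theta\cos\phi \\ 1 & 0 & \cos\theta \end{pmatrix}\cdot\begin{pmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{pmatrix} \tag{13.93}$$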
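
A useful consistency check on equations 13.86-88 and 13.89-91 is that the two sets of components describe the same vector, so they must be related by the rotation matrix of equation 13.83. The following is a minimal sketch assuming arbitrary illustrative values for the Euler angles and their rates; the function names are hypothetical.

```python
import math

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Body-fixed components of the angular velocity (equations 13.86-88)."""
    return [dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi),
            dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi),
            dphi * math.cos(theta) + dpsi]

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Space-fixed components of the angular velocity (equations 13.89-91)."""
    return [dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi),
            dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi),
            dphi + dpsi * math.cos(theta)]

def euler_closed_form(phi, theta, psi):
    """Rotation matrix from space-fixed to body-fixed axes (equation 13.83)."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
            [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
            [sf * st, -cf * st, ct]]

# Arbitrary illustrative angles (radians) and angular rates (radians per second).
state = (0.4, 1.1, -0.7, 2.0, -0.5, 3.0)
w_body = omega_body(*state)
w_space = omega_space(*state)

A = euler_closed_form(*state[:3])
w_rotated = [sum(A[i][j] * w_space[j] for j in range(3)) for i in range(3)]

norm2_body = sum(c * c for c in w_body)
norm2_space = sum(c * c for c in w_space)
print(w_body)
print(w_rotated)
print(norm2_body, norm2_space)
```

The rotated space-fixed components reproduce the body-fixed components, and the squared magnitudes agree, as expected for a single vector expressed in two orthogonal frames.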


13.15 Kinetic energy in terms of Euler angular velocities 


The kinetic energy is a scalar quantity and thus is the same in both stationary and rotating frames of
reference. It is much easier to evaluate the kinetic energy in the rotating principal-axis frame since the
inertia tensor is diagonal in the principal-axis frame as given in equation 13.69

$$T_{rot} = \frac{1}{2}\sum_{i}^{3} I_i\,\omega_i^2 \tag{13.94}$$

Using equations 13.86-88 for the body-fixed angular velocities gives the rotational kinetic energy in terms
of the Euler angular velocities and principal-frame moments of inertia to be

$$T_{rot} = \frac{1}{2}\left[I_1\left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2 + I_2\left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2 + I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2\right] \tag{13.95}$$
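
Equation 13.95 translates directly into a short function. The sketch below is illustrative only; the function name `t_rot` and all numerical values are assumptions. It also checks the standard special case of a symmetric top, $I_1 = I_2$, for which the expression reduces to $\frac{1}{2}I_1(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta) + \frac{1}{2}I_3(\dot{\phi}\cos\theta + \dot{\psi})^2$ and no longer depends on the angle $\psi$.

```python
import math

def t_rot(I1, I2, I3, theta, psi, dphi, dtheta, dpsi):
    """Rotational kinetic energy in the principal-axis frame (equation 13.95)."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return 0.5 * (I1 * w1 ** 2 + I2 * w2 ** 2 + I3 * w3 ** 2)

# Illustrative symmetric top: I1 = I2 differs from I3.
I1, I3 = 2.0, 0.5
theta, dphi, dtheta, dpsi = 0.9, 1.5, -0.4, 6.0

T_a = t_rot(I1, I1, I3, theta, 0.3, dphi, dtheta, dpsi)
T_b = t_rot(I1, I1, I3, theta, 2.1, dphi, dtheta, dpsi)  # different psi
T_reduced = (0.5 * I1 * (dtheta ** 2 + (dphi * math.sin(theta)) ** 2)
             + 0.5 * I3 * (dphi * math.cos(theta) + dpsi) ** 2)

# Asymmetric top for comparison: the energy now depends on psi.
T_c = t_rot(2.0, 3.0, 0.5, theta, 0.3, dphi, dtheta, dpsi)
T_d = t_rot(2.0, 3.0, 0.5, theta, 2.1, dphi, dtheta, dpsi)

print(T_a, T_b, T_reduced)
print(T_c, T_d)
```

Note that $\phi$ itself never appears in equation 13.95, only its rate of change, which is why the function does not take $\phi$ as an argument.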


13.16 Rotational invariants


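The scalar properties of a rotating body, such as mass $M$, Lagrangian $L$, and Hamiltonian $H$, are rotationally
invariant, that is, they are the same in any body-fixed or laboratory-fixed coordinate frame. This fact also
applies to scalar products of all vector observables such as angular momentum. For example the scalar
product

$$\mathbf{L}\cdot\mathbf{L} = l^2 \tag{13.96}$$

where $l$ is the root mean square value of the angular momentum. An example of a scalar invariant is the
scalar product of the angular velocity

$$\boldsymbol{\omega}\cdot\boldsymbol{\omega} = \omega^2 \tag{13.97}$$

where $\omega^2$ is the mean square angular velocity. The scalar product $\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2$ can be calculated using the
Euler-angle velocities for the body-fixed frame, equations 13.86-88, to be

$$\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2 = \omega_1^2 + \omega_2^2 + \omega_3^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$$

Similarly, the scalar product can be calculated using the Euler angle velocities for the space-fixed frame
using equations 13.89-91.

$$\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2 = \omega_x^2 + \omega_y^2 + \omega_z^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$$

This shows the obvious result that the scalar product $\boldsymbol{\omega}\cdot\boldsymbol{\omega} = |\omega|^2$ is invariant to rotations of the coordinate
frame, that is, it is identical when evaluated in either the space-fixed or body-fixed frames.
Note that for $\theta = 0$, the $\hat{3}$ and $\hat{z}$ axes are parallel, and perpendicular to the $\dot{\theta}$ axis, then

$$|\omega|^2 = \left(\dot{\phi} + \dot{\psi}\right)^2 + \dot{\theta}^2$$

For the case when $\theta = 180°$, the $\hat{3}$ and $\hat{z}$ axes are antiparallel, and perpendicular to the $\dot{\theta}$ axis, then

$$|\omega|^2 = \left(\dot{\phi} - \dot{\psi}\right)^2 + \dot{\theta}^2$$

For the case when $\theta = 90°$, the $\hat{3}$, $\hat{z}$, and $\dot{\theta}$ axes are mutually perpendicular, that is, orthogonal, and then

$$|\omega|^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2$$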
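
The body-fixed expression for $|\omega|^2$ follows in a few lines from equations 13.86-88. The intermediate steps, written out here as a worked check rather than as part of the original derivation, are:

```latex
\begin{aligned}
\omega_1^2 + \omega_2^2
 &= \left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2
  + \left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2 \\
 &= \dot{\phi}^2\sin^2\theta\left(\sin^2\psi + \cos^2\psi\right)
  + \dot{\theta}^2\left(\cos^2\psi + \sin^2\psi\right)
  = \dot{\phi}^2\sin^2\theta + \dot{\theta}^2 \\
\omega_3^2
 &= \left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2
  = \dot{\phi}^2\cos^2\theta + 2\dot{\phi}\dot{\psi}\cos\theta + \dot{\psi}^2 \\
|\omega|^2
 &= \omega_1^2 + \omega_2^2 + \omega_3^2
  = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta
\end{aligned}
```

The cross terms in $\dot{\phi}\dot{\theta}$ cancel between $\omega_1^2$ and $\omega_2^2$, so the only surviving coupling is between the precession and spin rates, whose axes are separated by the angle $\theta$. Setting $\theta = 0$, $180°$, and $90°$ reproduces the three special cases above.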


The time-averaged shape of a rapidly-rotating body, as seen in the fixed inertial frame, is very different 
from the actual shape of the body, and this difference depends on the rotational frequency. For example, a 
pencil rotating rapidly about an axis perpendicular to the body-fixed symmetry axis has an average shape 
that is a flat disk in the laboratory frame which bears little resemblance to a pencil. The actual shape of the 
pencil could be determined by taking high-speed photographs which display the instantaneous body-fixed 
shape of the object at given times. Unfortunately for fast rotation, such as rotation of a molecule or a 
nucleus, it is not possible to take photographs with sufficient speed and spatial resolution to observe the 
instantaneous shape of the rotating body. What is measured is the average shape of the body as seen in the 
fixed laboratory frame. In principle the shape observed in the fixed inertial frame can be related to the shape 
in the body-fixed frame, but this requires knowing the body-fixed shape which in general is not known. For 
example, a deformed nucleus may be both vibrating and rotating about some triaxially deformed average 
shape which is a function of the rotational frequency. This is not apparent from the shapes measured in the 
fixed frame for each of the excited states. 

The fact that scalar products are rotationally invariant provides a powerful means of transforming products
of observables in the body-fixed frame to those in the laboratory frame. In 1971 Cline developed
a powerful model-independent method that utilizes rotationally-invariant products of the electromagnetic 
quadrupole operator E2 to relate the electromagnetic E2 properties for the observed levels of a rotating 
nucleus measured in the laboratory frame, to the electromagnetic E2 properties of the deformed rotating 
nucleus measured in the body-fixed frame [Cli71, Cli72, Cli86]. The method uses the fact that scalar products
of the electromagnetic multipole operators are rotationally invariant. This allows transforming scalar products
of a complete set of measured electromagnetic matrix elements, measured in the laboratory frame, into
the electromagnetic properties in the body-fixed frame of the rotating nucleus. These rotational invariants
provide a model-independent determination of the magnitude, triaxiality, and vibrational amplitudes of the 
average shapes in the body-fixed frame for individual observed nuclear states that may be undergoing both 
rotation and vibration. When the bombarding energy is below the Coulomb barrier, the scattering of a 
projectile nucleus by a target nucleus is due purely to the electromagnetic interaction since the distance 
of closest approach exceeds the range of the nuclear force. For such pure Coulomb collisions, the electromagnetic
excitation of collective nuclei populates many excited states, as illustrated in figure 14.13, with
cross sections that are a direct measure of the E2 matrix elements. These measured matrix elements are
precisely those required to evaluate, in the laboratory frame, the E2 rotational invariants from which it is
possible to deduce the intrinsic quadrupole shapes of the rotating-vibrating nuclear states in the body-fixed
frame [Cli86].


13.17 Euler’s equations of motion for rigid-body rotation 


Rigid-body rotation can be confusing in that two coordinate frames are involved and, in general, the angular 
velocity and angular momentum are not aligned. The motion of the rigid body is observed in the space-fixed 
inertial frame whereas it is simpler to calculate the equations of motion in the body-fixed principal axis 
frame, for which the inertia tensor is known and is constant. The rigid body is rotating about the angular
velocity vector $\boldsymbol{\omega}$, which is not aligned with the angular momentum $\mathbf{L}$. For torque-free motion, $\mathbf{L}$ is conserved
and has a fixed orientation in the space-fixed axis system. Euler’s equations of motion, presented below,
are given in the body-fixed frame for which the inertia tensor is known since this simplifies solution of the
equations of motion. However, this solution has to be rotated back into the space-fixed frame to describe 
the rotational motion as seen by an observer in the inertial frame. 

This chapter has introduced the inertial properties of a rigid body, as well as the Euler angles for 
transforming between the body-fixed and inertial frames of reference. This has prepared the stage for 
solving the equations of motion for rigid-body motion, namely, the dynamics of rotational motion about a 
body-fixed point under the action of external forces. The Euler angles are used to specify the instantaneous 
orientation of the rigid body. 

In Newtonian mechanics, the rotational motion is governed by the equivalent of Newton’s second law given
in terms of the external torque $\mathbf{N}$ and angular momentum $\mathbf{L}$

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} \tag{13.98}$$


Note that this relation is expressed in the inertial space-fixed frame of reference, not the non-inertial body- 
fixed frame. The subscript space is added to emphasize that this equation is written in the inertial space-fixed 
frame of reference. However, as already discussed, it is much more convenient to transform from the space- 
fixed inertial frame to the body-fixed frame for which the inertia tensor of the rigid body is known. Thus the 
next stage is to express the rotational motion in terms of the body-fixed frame of reference. For simplicity, 
translational motion will be ignored. 

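The rate of change of angular momentum can be written in terms of the body-fixed value, using the
transformation from the space-fixed inertial frame $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ to the rotating frame $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ as given in
chapter 10.3,

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} = \left(\frac{d\mathbf{L}}{dt}\right)_{body} + \boldsymbol{\omega}\times\mathbf{L} \tag{13.99}$$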


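However, the body axis $\hat{\mathbf{e}}_i$ is chosen to be the principal axis such that

$$L_i = I_i\omega_i \tag{13.100}$$

where the principal moments of inertia are written as $I_i$. Thus the equation of motion can be written using
the body-fixed coordinate system as

$$\mathbf{N} = I_1\dot{\omega}_1\hat{\mathbf{e}}_1 + I_2\dot{\omega}_2\hat{\mathbf{e}}_2 + I_3\dot{\omega}_3\hat{\mathbf{e}}_3 + \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ I_1\omega_1 & I_2\omega_2 & I_3\omega_3 \end{vmatrix} \tag{13.101}$$

$$= \left(I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3\right)\hat{\mathbf{e}}_1 + \left(I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1\right)\hat{\mathbf{e}}_2 + \left(I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2\right)\hat{\mathbf{e}}_3 \tag{13.102}$$

where the components in the body-fixed axes are given by

$$\begin{aligned}
N_1 &= I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 \\
N_2 &= I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 \\
N_3 &= I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{13.103}$$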


These are the Euler equations for a rigid body in a force field expressed in the body-fixed coordinate
frame. They are applicable for any applied external torque $\mathbf{N}$.

The motion of a rigid body depends on the structure of the body only via the three principal moments
of inertia $I_1$, $I_2$, and $I_3$. Thus all bodies having the same principal moments of inertia will behave exactly the
same even though the bodies may have very different shapes. As discussed earlier, the simplest geometrical 
shape of a body having three different principal moments is a homogeneous ellipsoid. Thus, the rigid-body 
motion often is described in terms of the equivalent ellipsoid that has the same principal moments. 

A deficiency of Euler’s equations is that the solutions yield the time variation of $\boldsymbol{\omega}$ as seen from the body-fixed
reference frame axes, and not in the observer's fixed inertial coordinate frame. Similarly the components
of the external torques in the Euler equations are given with respect to the body-fixed axis system which
implies that the orientation of the body is already known. Thus for non-zero external torques the problem
cannot be solved until the orientation is known in order to determine the components $N_i^{ext}$. However,
these difficulties disappear when the external torques are zero, or if the motion of the body is known and it 
is required to compute the applied torques necessary to produce such motion. 
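The two situations in which Euler's equations are directly usable can be written as a pair of small functions. This is a sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, all quantities are body-fixed principal-axis components, and the numerical values are arbitrary illustrations. `euler_torque` evaluates equations 13.103 for a known motion, and `euler_omega_dot` inverts them for a known body-frame torque.

```python
def euler_torque(I, omega, omega_dot):
    """Body-frame torque (N1, N2, N3) needed to produce a given motion (eq. 13.103)."""
    I1, I2, I3 = I
    w1, w2, w3 = omega
    dw1, dw2, dw3 = omega_dot
    return (I1 * dw1 - (I2 - I3) * w2 * w3,
            I2 * dw2 - (I3 - I1) * w3 * w1,
            I3 * dw3 - (I1 - I2) * w1 * w2)

def euler_omega_dot(I, omega, torque):
    """Angular acceleration in the body frame for a known body-frame torque."""
    I1, I2, I3 = I
    w1, w2, w3 = omega
    N1, N2, N3 = torque
    return ((N1 + (I2 - I3) * w2 * w3) / I1,
            (N2 + (I3 - I1) * w3 * w1) / I2,
            (N3 + (I1 - I2) * w1 * w2) / I3)

# Illustrative principal moments and instantaneous angular velocity.
I = (1.0, 2.0, 3.0)
omega = (0.1, 0.2, 0.3)

# Torque-free case: the angular velocity changes even though N = 0.
free_accel = euler_omega_dot(I, omega, (0.0, 0.0, 0.0))

# Known-motion case: torque required to hold omega constant in the body frame.
required_torque = euler_torque(I, omega, (0.0, 0.0, 0.0))

# Consistency check: feeding the torque-free acceleration back gives zero torque.
round_trip = euler_torque(I, omega, free_accel)
```

For constant $\boldsymbol{\omega}$ in the body frame the required torque reduces to $\boldsymbol{\omega}\times\mathbf{L}$, in agreement with equation 13.99.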


13.18 Lagrange equations of motion for rigid-body rotation 


The Euler equations of motion were derived using Newtonian concepts of torque and angular momentum. 
It is of interest to derive the equations of motion using Lagrangian mechanics. It is convenient to use a 
generalized torque N and assume that U = 0 in the Lagrange-Euler equations. Note that the generalized 
force is a torque since the corresponding generalized coordinate is an angle, and the conjugate momentum 
is angular momentum. If the body-fixed frame of reference is chosen to be the principal axes system, then, 
since the inertia tensor is diagonal in the principal axis frame, the kinetic energy is given in terms of the 
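principal moments of inertia as

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 \tag{13.104}$$

Using the Euler angles as generalized coordinates, the Lagrange equation for the specific case of the $\psi$
coordinate, including a generalized force $N_\psi$, gives

$$\frac{d}{dt}\frac{\partial T}{\partial\dot{\psi}} - \frac{\partial T}{\partial\psi} = N_\psi \tag{13.105}$$

which can be expressed as

$$\frac{d}{dt}\sum_i\frac{\partial T}{\partial\omega_i}\frac{\partial\omega_i}{\partial\dot{\psi}} - \sum_i\frac{\partial T}{\partial\omega_i}\frac{\partial\omega_i}{\partial\psi} = N_\psi \tag{13.106}$$

Equation 13.104 gives

$$\frac{\partial T}{\partial\omega_i} = I_i\omega_i \tag{13.107}$$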


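Differentiating the angular velocity components in the body-fixed frame, equations 13.86-13.88, gives

$$\begin{aligned}
\frac{\partial\omega_1}{\partial\psi} &= \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi = \omega_2 & \frac{\partial\omega_1}{\partial\dot{\psi}} &= 0 \\
\frac{\partial\omega_2}{\partial\psi} &= -\dot{\phi}\sin\theta\sin\psi - \dot{\theta}\cos\psi = -\omega_1 & \frac{\partial\omega_2}{\partial\dot{\psi}} &= 0 \\
\frac{\partial\omega_3}{\partial\psi} &= 0 & \frac{\partial\omega_3}{\partial\dot{\psi}} &= 1
\end{aligned}$$

Substituting these into the Lagrange equation (13.106) gives

$$\frac{d}{dt}I_3\omega_3 - I_1\omega_1\omega_2 - I_2\omega_2(-\omega_1) = N_3 \tag{13.108}$$

where $N_\psi = N_3$ since the $\hat{\psi}$ and $\hat{\mathbf{e}}_3$ axes are collinear. This can be rewritten as

$$I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 = N_3 \tag{13.109}$$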


Any axis could have been designated the $\hat{\mathbf{e}}_3$ axis, thus the above equation can be generalized to all three
axes to give

$$\begin{aligned}
I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 &= N_1 \\
I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 &= N_2 \\
I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 &= N_3
\end{aligned} \tag{13.110}$$

These are the Euler equations given previously in (13.103). Note that although the $\dot{\omega}_3$ equation is the equation
of motion for the $\psi$ coordinate, this is not true for the $\phi$ and $\theta$ rotations which are not along the body-fixed
$x_1$ and $x_2$ axes as given in table 13.1.


13.8 Example: Rotation of a dumbbell 


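Consider the motion of the symmetric dumbbell shown in the adjacent figure. Let $|r_1| = |r_2| = b$. Let the
body-fixed coordinate system have its origin at O and symmetry axis $\hat{\mathbf{e}}_3$ be along the weightless shaft toward
$m_1$, and $\mathbf{v}_i = \boldsymbol{\omega}\times\mathbf{r}_i$. The angular momentum is given by

$$\mathbf{L} = \sum_i m_i\,\mathbf{r}_i\times\mathbf{v}_i$$

Because $\mathbf{L}$ is perpendicular to the shaft, and $\mathbf{L}$ rotates around $\boldsymbol{\omega}$ as the shaft rotates, let $\hat{\mathbf{e}}_2$ be along $\mathbf{L}$.

$$\mathbf{L} = L_2\hat{\mathbf{e}}_2$$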


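If $\alpha$ is the angle between $\boldsymbol{\omega}$ and the shaft, the components of $\boldsymbol{\omega}$
are

$$\begin{aligned}
\omega_1 &= 0 \\
\omega_2 &= \omega\sin\alpha \\
\omega_3 &= \omega\cos\alpha
\end{aligned}$$

Assume that the principal moments of the dumbbell are

$$\begin{aligned}
I_1 &= (m_1 + m_2)\,b^2 \\
I_2 &= (m_1 + m_2)\,b^2 \\
I_3 &= 0
\end{aligned}$$

Thus the angular momentum is given by

$$\begin{aligned}
L_1 &= I_1\omega_1 = 0 \\
L_2 &= I_2\omega_2 = (m_1 + m_2)\,b^2\omega\sin\alpha \\
L_3 &= I_3\omega_3 = 0
\end{aligned}$$

[Figure: Rotation of a dumbbell.]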


which is consistent with the angular momentum being along the $\hat{\mathbf{e}}_2$ axis.
Using Euler’s equations, and assuming that the angular velocity is constant, i.e. $\dot{\boldsymbol{\omega}} = 0$, then the components
of the torque required to satisfy this motion are

$$\begin{aligned}
N_1 &= -(m_1 + m_2)\,b^2\omega^2\sin\alpha\cos\alpha \\
N_2 &= 0 \\
N_3 &= 0
\end{aligned}$$

That is, this motion can only occur in the presence of the above applied torque which is in the direction
$-\hat{\mathbf{e}}_1$, that is, mutually perpendicular to $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$. This torque can be written as $\mathbf{N} = \boldsymbol{\omega}\times\mathbf{L}$.
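The dumbbell result can be checked by evaluating the torque in three ways: from Euler's equations with $\dot{\boldsymbol{\omega}} = 0$, from the vector product $\boldsymbol{\omega}\times\mathbf{L}$, and from the closed-form expression for $N_1$. The sketch below uses arbitrary illustrative values for the masses, shaft half-length, angular speed and angle; none of them come from the text.

```python
import math

def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Illustrative values (assumptions): masses, half-length b, speed w, angle alpha.
m1, m2, b, w, alpha = 1.0, 2.0, 0.5, 3.0, math.radians(30.0)

# Principal moments and body-frame angular velocity as given in the example.
I = ((m1 + m2) * b**2, (m1 + m2) * b**2, 0.0)
omega = (0.0, w * math.sin(alpha), w * math.cos(alpha))

# Angular momentum L_i = I_i * omega_i; only the e2 component is non-zero.
L = tuple(Ii * wi for Ii, wi in zip(I, omega))

# Euler's equations (13.103) with zero angular acceleration.
N_euler = (-(I[1] - I[2]) * omega[1] * omega[2],
           -(I[2] - I[0]) * omega[2] * omega[0],
           -(I[0] - I[1]) * omega[0] * omega[1])

# The same torque written as omega x L.
N_cross = cross(omega, L)

# Closed-form N1 quoted in the example.
N1_closed = -(m1 + m2) * b**2 * w**2 * math.sin(alpha) * math.cos(alpha)
```

The three evaluations agree, and the only non-zero component lies along $-\hat{\mathbf{e}}_1$.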


13.19 Hamiltonian equations of motion for rigid-body rotation


The Hamiltonian equations of motion are expressed in terms of the Euler angles plus their corresponding
canonical angular momenta $(\phi, \theta, \psi, p_\phi, p_\theta, p_\psi)$ in contrast to Lagrangian mechanics which is based on the
Euler angles plus their corresponding angular velocities $(\phi, \theta, \psi, \dot{\phi}, \dot{\theta}, \dot{\psi})$. The Hamiltonian approach is conveniently
expressed in terms of a set of Andoyer-Deprit action-angle coordinates that include the three Euler
angles, specifying the orientation of the body-fixed frame, plus the corresponding three angles specifying the
orientation of the spin frame of reference. This phase space approach [Dep67] can be employed for calculations
of rotational motion in celestial mechanics that can include spin-orbit coupling. This Hamiltonian
approach is beyond the scope of the present textbook.


13.20 Torque-free rotation of an inertially-symmetric rigid rotor 


13.20.1 Euler’s equations of motion: 


There are many situations where one has rigid-body motion free
of external torques, that is, $\mathbf{N} = 0$. The tumbling motion of a
juggler's baton, a diver, a rotating galaxy, or a frisbee, are examples
of rigid-body rotation. For torque-free rotation, the body
will rotate about the center of mass, and thus the inertia tensor
with respect to the center of mass is required. An inertially-symmetric
rigid body has two identical principal moments of
inertia with $I_1 = I_2 \neq I_3$, and provides a simple example that
illustrates the underlying motion. The torque-free Euler equations
for the symmetric body in the body-fixed principal axis system
are given by

$$(I_2 - I_3)\,\omega_2\omega_3 - I_1\dot{\omega}_1 = 0 \tag{13.111}$$
$$(I_3 - I_1)\,\omega_3\omega_1 - I_2\dot{\omega}_2 = 0 \tag{13.112}$$
$$I_3\dot{\omega}_3 = 0 \tag{13.113}$$

where $I_1 = I_2$ and $\mathbf{N} = 0$ apply.

Note that for torque-free motion of an inertially symmetric
body equation 13.113 implies that $\dot{\omega}_3 = 0$, i.e. $\omega_3$ is a constant
of motion and thus is a cyclic variable for the symmetric rigid
body.

[Figure: ... trajectory about the body-fixed symmetry axis 3.]

Equations 13.111 and 13.112 can be written as two coupled
equations

$$\dot{\omega}_1 + \Omega\omega_2 = 0 \tag{13.114}$$
$$\dot{\omega}_2 - \Omega\omega_1 = 0 \tag{13.115}$$


where the precession angular velocity $\Omega = -\dot{\psi}$ with respect to the body-fixed frame is defined to be

$$\Omega \equiv \left(\frac{I_3 - I_1}{I_1}\right)\omega_3 \tag{13.116}$$

Combining the time derivatives of equations 13.114 and 13.115 leads to two uncoupled equations

$$\ddot{\omega}_1 + \Omega^2\omega_1 = 0 \tag{13.117}$$
$$\ddot{\omega}_2 + \Omega^2\omega_2 = 0 \tag{13.118}$$

These are the differential equations for a harmonic oscillator with solutions

$$\begin{aligned}
\omega_1 &= A\cos\Omega t \\
\omega_2 &= A\sin\Omega t
\end{aligned} \tag{13.119}$$

These equations describe a vector $\mathbf{A}$ rotating in a circle of radius $A$ about the $\hat{\mathbf{e}}_3$ axis, that
is, rotating in the $\hat{\mathbf{e}}_1 - \hat{\mathbf{e}}_2$ plane with angular frequency $\Omega = -\dot{\psi}$. Note that

$$\omega_1^2 + \omega_2^2 = A^2 \tag{13.120}$$

which is a constant. In addition $\omega_3$ is constant, therefore the magnitude of the total angular velocity

$$|\omega| = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \text{constant} \tag{13.121}$$


The motion of the torque-free symmetric body is that the angular velocity $\boldsymbol{\omega}$ precesses around the
symmetry axis $\hat{\mathbf{e}}_3$ of the body at an angle $\alpha$ with a constant precession frequency $\Omega$ with respect to the
body-fixed frame as shown in figure 13.4. Thus, to an observer on the body, $\boldsymbol{\omega}$ traces out a cone around the
body-fixed symmetry axis. Note from (13.116) that the vectors $\Omega\hat{\mathbf{e}}_3$ and $\omega_3\hat{\mathbf{e}}_3$ are parallel when $\Omega$ is positive,
that is, $I_3 > I_1$ (oblate shape), and antiparallel if $I_3 < I_1$ (prolate shape).
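As a check on the solution 13.119, the harmonic form of $\omega_1$ and $\omega_2$ can be substituted back into the torque-free Euler equations 13.111-13.113. The sketch below does this with analytic derivatives; the function names and the numerical values of $I_1$, $I_3$, $\omega_3$ and $A$ are illustrative assumptions.

```python
import math

def body_precession_rate(I1, I3, w3):
    """Body-frame precession frequency Omega of equation 13.116."""
    return (I3 - I1) / I1 * w3

def euler_residuals(I1, I3, w3, A, t):
    """Left-hand sides of equations 13.111-13.113 for the solution 13.119."""
    Omega = body_precession_rate(I1, I3, w3)
    w1 = A * math.cos(Omega * t)
    w2 = A * math.sin(Omega * t)
    dw1 = -A * Omega * math.sin(Omega * t)   # time derivative of omega_1
    dw2 = A * Omega * math.cos(Omega * t)    # time derivative of omega_2
    dw3 = 0.0                                # omega_3 is a constant of motion
    return ((I1 - I3) * w2 * w3 - I1 * dw1,  # 13.111 with I2 = I1
            (I3 - I1) * w3 * w1 - I1 * dw2,  # 13.112 with I2 = I1
            I3 * dw3)                        # 13.113

# Oblate (I3 > I1) and prolate (I3 < I1) examples with the same spin omega_3.
Omega_oblate = body_precession_rate(1.0, 2.0, 5.0)
Omega_prolate = body_precession_rate(1.0, 0.25, 5.0)

# Largest residual over a range of sample times for the oblate case.
max_residual = max(abs(r)
                   for k in range(50)
                   for r in euler_residuals(1.0, 2.0, 5.0, 0.3, 0.1 * k))
```

The residuals vanish to rounding error, and the sign of $\Omega$ relative to $\omega_3$ distinguishes the oblate and prolate cases.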

For the system considered, the orientation of the angular momentum vector $\mathbf{L}$ must be stationary in the
space-fixed inertial frame since the system is torque free, that is, $\mathbf{L}$ is a constant of motion. Also we have
that the projection of the angular momentum on the body-fixed symmetry axis is a constant of motion, that
is, it is a cyclic variable. Thus

$$L_3 = I_3\omega_3 = \frac{I_1 I_3}{(I_3 - I_1)}\,\Omega \tag{13.122}$$

Understanding the relation between the angular momentum and angular velocity is facilitated by considering
another constant of motion for the torque-free symmetric rotor, namely the rotational kinetic energy.

$$T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L} = \text{constant} \tag{13.123}$$

Since $\mathbf{L}$ is a constant for torque-free motion, and also the magnitude of $\boldsymbol{\omega}$ was shown to be constant, therefore
the angle between these two vectors must be a constant to ensure that also $T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}$ = constant. That
is, $\boldsymbol{\omega}$ precesses around $\mathbf{L}$ at a constant angle $(\theta - \alpha)$ such that the projection of $\boldsymbol{\omega}$ onto $\mathbf{L}$ is constant. Note
that

$$\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \omega_2\hat{\mathbf{e}}_1 - \omega_1\hat{\mathbf{e}}_2 \tag{13.124}$$

and, for a symmetric rotor,

$$\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = I_1\omega_1\omega_2 - I_2\omega_2\omega_1 = 0 \tag{13.125}$$

since $I_1 = I_2$ for the symmetric rotor. Because $\mathbf{L}\cdot\boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = 0$ for a symmetric top then $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}_3$ are
coplanar.

Figure 13.5 shows the geometry of the motion for both oblate and prolate axially-deformed bodies. To
an observer in the space-fixed inertial frame, the angular velocity $\boldsymbol{\omega}$ traces out a cone that precesses with
angular velocity $\Omega$ around the space-fixed $\mathbf{L}$ axis called the space cone. For convenience, figure 13.5 assumes
that $\mathbf{L}$ and the space-fixed inertial frame $\hat{\mathbf{z}}$ axis are collinear. The angular velocity $\boldsymbol{\omega}$ also traces out the
body cone as it precesses about the body-fixed $\hat{\mathbf{e}}_3$ axis. Since $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{\mathbf{e}}_3$ are coplanar, then the $\boldsymbol{\omega}$ vector is
at the intersection of the space and body cones as the body cone rolls around the space cone. That is, the
space and body cones have one generatrix in common which coincides with $\boldsymbol{\omega}$. As shown in figure 13.5b, for
a needle the body cone appears to roll without slipping on the outside of the space cone at the precessional
velocity of $\Omega = -\dot{\psi}$. By contrast, as shown in figure 13.5a for an oblate (disc-shaped) symmetric top the
space cone rolls inside the body cone and the precession $\Omega$ is faster than $\omega$.

Since no external torques are acting for torque-free motion, then the magnitude and direction of the total
angular momentum are conserved. The description of the motion is simplified if $\mathbf{L}$ is taken to be along the
space-fixed $\hat{\mathbf{z}}$ axis, then the Euler angle $\theta$ is the angle between the body-fixed basis vector $\hat{\mathbf{e}}_3$ and space-fixed
basis vector $\hat{\mathbf{z}}$. If at some instant in the body frame, it is assumed that $\hat{\mathbf{e}}_2$ is aligned in the plane of $\mathbf{L}$, $\boldsymbol{\omega}$
and $\hat{\mathbf{e}}_3$, then

$$L_1 = 0 \qquad L_2 = L\sin\theta \qquad L_3 = L\cos\theta \tag{13.126}$$

If $\alpha$ is the angle between the angular velocity $\boldsymbol{\omega}$ and the body-fixed $\hat{\mathbf{e}}_3$ axis, then at the same instant

$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha \tag{13.127}$$

[Figure 13.5, panels (a) and (b), each labelled with the space cone and the body cone.]

Figure 13.5: Torque-free rotation of symmetric tops; (a) circular flat disk, (b) circular rod. The space-fixed
and body-fixed cones are shown by fine lines. The space-fixed axis system is designated by the unit vectors
$(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ and the body-fixed principal axis system by unit vectors $(\hat{1}, \hat{2}, \hat{3})$.


The components of the angular momentum also can be derived from $\mathbf{L} = \mathbf{I}\cdot\boldsymbol{\omega}$ to give

$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = I_1\omega\sin\alpha \qquad L_3 = I_3\omega_3 = I_3\omega\cos\alpha \tag{13.128}$$

Equations 13.126 and 13.128 give two relations for the ratio $\frac{L_2}{L_3}$, that is,

$$\frac{L_2}{L_3} = \tan\theta = \frac{I_1}{I_3}\tan\alpha \tag{13.129}$$

For a prolate spheroid $I_1 > I_3$ therefore $\theta > \alpha$ while $\Omega$ and $\omega_3$ have opposite signs.
For an oblate spheroid $I_1 < I_3$ therefore $\alpha > \theta$ while $\Omega$ and $\omega_3$ have the same sign.

The sense of precession can be understood if the body cone rolls without slipping on the outside of the
space cone with $\Omega$ in the opposite orientation to $\omega$ for the prolate case, while for the oblate case the space
cone rolls inside the body cone with $\Omega$ and $\omega$ oriented in similar directions. Note from (13.129) that $\theta = 0$
if $\alpha = 0$, that is $\mathbf{L}$, $\boldsymbol{\omega}$ and the $\hat{3}$ axis are aligned corresponding to a principal axis. Similarly, $\theta = 90^\circ$ if
$\alpha = 90^\circ$, then again $\mathbf{L}$ and $\boldsymbol{\omega}$ are aligned corresponding to them being principal axes.

Lagrangian mechanics has been used to calculate the motion with respect to the body-fixed principal 
axis system. However, the motion needs to be known relative to the space-fixed inertial frame where the 
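motion is observed. This transformation can be done using the following relation

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{body} + \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 \tag{13.130}$$

since the unit vector $\hat{\mathbf{e}}_3$ is stationary in the body-fixed frame. The vector product of $\hat{\mathbf{e}}_3$ and $\boldsymbol{\omega}\times\hat{\mathbf{e}}_3$ gives

$$\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \hat{\mathbf{e}}_3\times\left(\boldsymbol{\omega}\times\hat{\mathbf{e}}_3\right) = (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3)\,\boldsymbol{\omega} - (\hat{\mathbf{e}}_3\cdot\boldsymbol{\omega})\,\hat{\mathbf{e}}_3 = \boldsymbol{\omega} - \omega_3\hat{\mathbf{e}}_3$$

therefore

$$\boldsymbol{\omega} = \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \tag{13.131}$$

The angular momentum equals $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$. Since $\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}$ is perpendicular to the $\hat{\mathbf{e}}_3$ axis, then
for the case with $I_1 = I_2$,

$$\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\omega_3\hat{\mathbf{e}}_3 \tag{13.132}$$

Thus the angular momentum for a torque-free symmetric rigid rotor comprises two components, one being
the perpendicular component that precesses around $\hat{\mathbf{e}}_3$, and the other is $L_3$.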

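In the space-fixed frame assume that the $\hat{\mathbf{z}}$ axis is collinear with $\mathbf{L}$. Then taking the scalar product of $\hat{\mathbf{e}}_3$
and $\mathbf{L}$, using equations 13.132 and 13.126, gives

$$\hat{\mathbf{e}}_3\cdot\mathbf{L} = L\cos\theta = I_1\,\hat{\mathbf{e}}_3\cdot\left[\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + I_3\omega_3\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3 \tag{13.133}$$

The first term on the right is zero and thus equations 13.133 and 13.126 give

$$L_3 = I_3\omega_3 = L\cos\theta \tag{13.134}$$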


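The time dependence of the rotation of the body-fixed symmetry axis with respect to the space-fixed
axis system can be obtained by taking the vector product $\hat{\mathbf{e}}_3\times\mathbf{L}$ using equation 13.132 and using equation
B.24 to expand the triple vector product,

$$\hat{\mathbf{e}}_3\times\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\times\left[\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + I_3\omega_3\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 \tag{13.135}$$

$$= I_1\left[\left(\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right)\hat{\mathbf{e}}_3 - (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3)\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + 0$$

since $(\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3) = 0$. Moreover $(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3) = 1$, and $\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = 0$, since they are perpendicular, then

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \frac{\mathbf{L}}{I_1}\times\hat{\mathbf{e}}_3 \tag{13.136}$$

This equation shows that the body-fixed symmetry axis $\hat{\mathbf{e}}_3$ precesses around $\mathbf{L}$, where $\mathbf{L}$ is a constant
of motion for torque-free rotation. The true rotational angular velocity $\boldsymbol{\omega}$ in the space-fixed frame, given by
equation 13.131, can be evaluated using equation 13.136. Remembering that it was assumed that $\mathbf{L}$ is in
the $\hat{\mathbf{z}}$ direction, that is, $\mathbf{L} = L\hat{\mathbf{z}}$, then

$$\boldsymbol{\omega} = \frac{L}{I_1}\hat{\mathbf{z}} + L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)\hat{\mathbf{e}}_3 \tag{13.137}$$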
A I; 

That is, the symmetry axis of the axially-symmetric rigid rotor makes an angle $\theta$ to the angular momentum
vector $L\hat{\mathbf{z}}$ and precesses around $L\hat{\mathbf{z}}$ with a constant angular velocity $\frac{L}{I_1}$ while the axial spin of the rigid body
has a constant value $\frac{L\cos\theta}{I_3}$. Thus, in the precessing frame, the rigid body appears to rotate about its fixed
symmetry axis with a constant angular velocity $\frac{L\cos\theta}{I_3} - \frac{L\cos\theta}{I_1} = L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)$. The precession of the
symmetry axis looks like a wobble superimposed on the spinning motion about the body-fixed symmetry
axis. The angular precession rate in the space-fixed frame can be deduced by using the fact that

$$\dot{\phi}\sin\theta = \omega\sin\alpha \tag{13.138}$$


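Then using equation 13.129 allows equation 13.138 to be written as

$$\dot{\phi} = \omega\sqrt{1 + \left[\left(\frac{I_3}{I_1}\right)^2 - 1\right]\cos^2\alpha} \tag{13.139}$$

which gives the precession rate about the space-fixed axis in terms of the angular velocity $\omega$. Note that the
precession rate $\dot{\phi} > \omega$ if $\frac{I_3}{I_1} > 1$, that is, for oblate shapes, and $\dot{\phi} < \omega$ if $\frac{I_3}{I_1} < 1$, that is, for prolate shapes.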


13.20.2 Lagrange equations of motion:


It is interesting to compare the equations of motion for torque-free rotation of an inertially-symmetric 
rigid rotor derived using Lagrange mechanics with that derived previously using Euler’s equations based on 
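Newtonian mechanics. Assume that the principal moments about the fixed point of the symmetric top are
$I_1 = I_2 \neq I_3$ and that the kinetic energy equals the rotational kinetic energy, that is, it is assumed that the
translational kinetic energy $T_{trans} = 0$. Then the kinetic energy is given by

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{13.140}$$

Equations 13.86-13.88 for the body-fixed frame give

$$\omega_1^2 = \left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2 = \dot{\phi}^2\sin^2\theta\sin^2\psi + 2\dot{\phi}\dot{\theta}\sin\theta\sin\psi\cos\psi + \dot{\theta}^2\cos^2\psi \tag{13.141}$$

$$\omega_2^2 = \left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2 = \dot{\phi}^2\sin^2\theta\cos^2\psi - 2\dot{\phi}\dot{\theta}\sin\theta\sin\psi\cos\psi + \dot{\theta}^2\sin^2\psi \tag{13.142}$$

Therefore

$$\omega_1^2 + \omega_2^2 = \dot{\phi}^2\sin^2\theta + \dot{\theta}^2 \tag{13.143}$$

and

$$\omega_3^2 = \left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.144}$$

Therefore the kinetic energy is

$$T = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.145}$$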


Since the system is torque free, the scalar potential energy $U$ can be assumed to be zero, and then the
Lagrangian equals

$$L = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.146}$$

The angular momentum about the space-fixed $z$ axis $p_\phi$ is conjugate to $\phi$. From Lagrange’s equations

$$\dot{p}_\phi = \frac{\partial L}{\partial\phi} = 0 \tag{13.147}$$

that is, the angular momentum about the space-fixed $z$ axis, $p_\phi$, is a constant of motion given by

$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = \left(I_1\sin^2\theta + I_3\cos^2\theta\right)\dot{\phi} + I_3\dot{\psi}\cos\theta = \text{constant} \tag{13.148}$$

Similarly, the angular momentum about the body-fixed 3 axis is conjugate to $\psi$. From Lagrange's equations,

$$\dot{p}_\psi = \frac{\partial L}{\partial\psi} = 0 \tag{13.149}$$

that is, $p_\psi$ is a constant of motion given by

$$p_\psi = \frac{\partial L}{\partial\dot{\psi}} = I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = I_3\omega_3 = \text{constant} \tag{13.150}$$

The above two relations derived from the Lagrangian can be solved to give the precession angular velocity
$\dot{\phi}$ about the space-fixed $\hat{\mathbf{z}}$ axis

$$\dot{\phi} = \frac{p_\phi - p_\psi\cos\theta}{I_1\sin^2\theta} \tag{13.151}$$

and the spin about the body-fixed $\hat{3}$ axis $\dot{\psi}$ which is given by

$$\dot{\psi} = \frac{p_\psi}{I_3} - \frac{\left(p_\phi - p_\psi\cos\theta\right)\cos\theta}{I_1\sin^2\theta} \tag{13.152}$$

Since $p_\phi$ and $p_\psi$ are constants of motion, then the precessional angular velocity $\dot{\phi}$ about the space-fixed $\hat{\mathbf{z}}$
axis, and the spin angular velocity $\dot{\psi}$, which is the spin frequency about the body-fixed $\hat{3}$ axis, are constants
that depend directly on $I_1$, $I_3$, and $\theta$.

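There is one additional constant of motion available if no dissipative forces act on the system, that is,
energy conservation which implies that the total energy

$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.153}$$

will be a constant of motion. But the second term on the right-hand side also is a constant of motion since
$p_\psi$ and $I_3$ both are constants, that is

$$\frac{1}{2}I_3\omega_3^2 = \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.154}$$

Thus energy conservation implies that the first term on the right-hand side also must be a constant given by

$$\frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) = E - \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.155}$$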
These results are identical to those given in equations 13.120 and 13.121 which were derived using Euler’s 
equations. These results illustrate that the underlying physics of the torque-free rigid rotor is more easily 
extracted using Lagrangian mechanics rather than using the Euler-angle approach of Newtonian mechanics. 
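The relations between the conserved momenta and the Euler-angle velocities can be exercised numerically. The sketch below, with hypothetical function names and illustrative input values, evaluates $p_\phi$ and $p_\psi$ from equations 13.148 and 13.150 and then recovers $\dot{\phi}$ and $\dot{\psi}$ from equations 13.151 and 13.152. It assumes $\sin\theta \neq 0$, since equation 13.151 is singular when the symmetry axis lies along the $\hat{\mathbf{z}}$ axis.

```python
import math

def conjugate_momenta(I1, I3, theta, dphi, dpsi):
    """Return (p_phi, p_psi) from equations 13.148 and 13.150."""
    s, c = math.sin(theta), math.cos(theta)
    p_phi = (I1 * s**2 + I3 * c**2) * dphi + I3 * dpsi * c
    p_psi = I3 * (dphi * c + dpsi)
    return p_phi, p_psi

def rates_from_momenta(I1, I3, theta, p_phi, p_psi):
    """Return (dphi, dpsi) from equations 13.151 and 13.152; needs sin(theta) != 0."""
    s, c = math.sin(theta), math.cos(theta)
    dphi = (p_phi - p_psi * c) / (I1 * s**2)
    dpsi = p_psi / I3 - dphi * c
    return dphi, dpsi

# Illustrative symmetric top: I1 = I2 = 2, I3 = 1, tilted by 0.6 rad.
I1, I3, theta = 2.0, 1.0, 0.6
dphi_in, dpsi_in = 0.8, 3.0

p_phi, p_psi = conjugate_momenta(I1, I3, theta, dphi_in, dpsi_in)
dphi_out, dpsi_out = rates_from_momenta(I1, I3, theta, p_phi, p_psi)

# Spin part of the kinetic energy, equation 13.154, evaluated two ways.
omega3 = dphi_in * math.cos(theta) + dpsi_in
spin_energy_direct = 0.5 * I3 * omega3**2
spin_energy_from_p = p_psi**2 / (2.0 * I3)
```

The round trip returns the input velocities, confirming that 13.151 and 13.152 are the inverse of 13.148 and 13.150.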


13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor 


Table 13.2 lists the precession and spin angular velocities, in the space-fixed frame, for torque-free rotation
of three extreme symmetric-top geometries spinning with constant angular velocity $\omega$ when the motion
is slightly perturbed such that $\boldsymbol{\omega}$ is at a small angle $\alpha$ to the symmetry axis. Note that this assumes the
perpendicular axis theorem, equation 13.45, which states that for a thin lamina $I_1 + I_2 = I_3$ giving, for a
thin circular disk, $I_1 = I_2$ and thus $I_3 = 2I_1$.


Table 13.2: Precession and spin rates for torque-free axial rotation of symmetric rigid rotors


Rigid-body symmetric shape | Principal moment ratio $I_3/I_1$ | Precession rate $\dot{\phi}$ | Spin rate $\dot{\psi}$
Symmetric needle           | 0                               | 0                          | $\omega$
Sphere                     | 1                               | $\omega$                   | 0
Thin circular disk         | 2                               | $2\omega$                  | $-\omega$

The precession angular velocity in the space frame ranges from 0 to $2\omega$ depending on whether the
body-fixed spin angular velocity is aligned or anti-aligned with the rotational frequency $\omega$. For an extreme
prolate spheroid $\frac{I_3}{I_1} = 0$, the body-fixed spin angular velocity $\Omega = -\omega_3$ which cancels the angular velocity
$\omega$ of the rotating frame resulting in a zero precession angular velocity of the body-fixed $\hat{\mathbf{e}}_3$ axis around the
space-fixed frame. The spin $\Omega = 0$ in the body-fixed frame for the rigid sphere $\frac{I_3}{I_1} = 1$, and thus the precession
rate of the body-fixed $\hat{\mathbf{e}}_3$ axis of the sphere around the space-fixed frame equals $\omega$. For oblate spheroids and
thin disks, such as a frisbee, $\frac{I_3}{I_1} = 2$ making the body-fixed precession angular velocity $\Omega = +\omega$ which adds
to the angular velocity $\omega$ and increases the precession rate up to $2\omega$ as seen in the space-fixed frame. This
illustrates that the spin angular velocity can add constructively or destructively with the angular velocity $\omega$.²
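The limiting values discussed in this example can be reproduced from equations 13.129 and 13.138 together with $\omega_3 = \dot{\phi}\cos\theta + \dot{\psi}$. The following sketch is illustrative only: the function name, the choice $\omega = 1$ and the small angle $\alpha = 0.01$ rad are assumptions, and $\alpha$ must be non-zero so that $\sin\theta \neq 0$.

```python
import math

def symmetric_top_rates(ratio, omega, alpha):
    """Return (theta, phi_dot, psi_dot) for moment ratio I3/I1 and tilt alpha > 0."""
    # Equation 13.129: tan(theta) = (I1/I3) tan(alpha); atan2 also handles I3/I1 = 0.
    theta = math.atan2(math.sin(alpha), ratio * math.cos(alpha))
    # Equation 13.138: phi_dot * sin(theta) = omega * sin(alpha).
    phi_dot = omega * math.sin(alpha) / math.sin(theta)
    # omega_3 = phi_dot * cos(theta) + psi_dot, with omega_3 = omega * cos(alpha).
    psi_dot = omega * math.cos(alpha) - phi_dot * math.cos(theta)
    return theta, phi_dot, psi_dot

def precession_closed_form(ratio, omega, alpha):
    """Precession rate obtained by eliminating theta between 13.129 and 13.138."""
    return omega * math.sqrt(1.0 + (ratio**2 - 1.0) * math.cos(alpha)**2)

omega, alpha = 1.0, 0.01
shapes = {"needle": 0.0, "sphere": 1.0, "disk": 2.0}
rates = {name: symmetric_top_rates(r, omega, alpha) for name, r in shapes.items()}

for name, (theta, phi_dot, psi_dot) in rates.items():
    print(f"{name:7s} theta={theta:.4f} phi_dot={phi_dot:.4f} psi_dot={psi_dot:.4f}")
```

For small $\alpha$ the precession rate approaches $(I_3/I_1)\,\omega$ and the spin rate approaches $(1 - I_3/I_1)\,\omega$, which are the needle, sphere and disk limits described above.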


2 In his autobiography Surely You're Joking, Mr. Feynman!, Feynman wrote "I was in the [Cornell] cafeteria and some guy, fooling
around, throws a plate in the air. As the plate went up in the air I saw it wobble, and noticed that the red medallion of
Cornell on the plate going around. It was pretty obvious to me that the medallion went around faster than the wobbling. I
started to figure out the motion of the rotating plate. I discovered that when the angle is very slight, the medallion rotates
twice as fast as the wobble rate. It came out of a very complicated equation!" The quoted ratio (2 : 1) is incorrect; it should
be (1 : 2). Benjamin Chao in Physics Today of February 1989 speculated that Feynman’s error in inverting the factor of
two might be "in keeping with the spirit of the author and the book, another practical joke meant for those who do physics
without experimenting". He pointed out that this story occurred on page 157 of a book of length 314 pages (1:2). Observe the
dependence of the ratio of wobble to rotation angular velocities on the tilt angle $\theta$.


13.21 Torque-free rotation of an asymmetric rigid rotor


The Euler equations of motion for the case of torque-free rotation of an asymmetric (triaxial) rigid rotor
about the center of mass, with principal moments of inertia $I_1 \neq I_2 \neq I_3$, lead to more complicated motion
than for the symmetric rigid rotor.³ The general features of the motion of the asymmetric rotor can be
deduced using the conservation of angular momentum and rotational kinetic energy.

Assuming that the external torques are zero then the Euler
equations of motion can be written as

$$\begin{aligned}
I_1\dot{\omega}_1 &= (I_2 - I_3)\,\omega_2\omega_3 \\
I_2\dot{\omega}_2 &= (I_3 - I_1)\,\omega_3\omega_1 \\
I_3\dot{\omega}_3 &= (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{13.156}$$

Since $L_i = I_i\omega_i$ for $i = 1, 2, 3$, then equation 13.156 gives

$$\begin{aligned}
I_2 I_3\dot{L}_1 &= (I_2 - I_3)\,L_2 L_3 \\
I_1 I_3\dot{L}_2 &= (I_3 - I_1)\,L_3 L_1 \\
I_1 I_2\dot{L}_3 &= (I_1 - I_2)\,L_1 L_2
\end{aligned} \tag{13.157}$$

Multiply the first equation by $I_1 L_1$, the second by $I_2 L_2$ and the
third by $I_3 L_3$ and sum, which gives

$$I_1 I_2 I_3\left(L_1\dot{L}_1 + L_2\dot{L}_2 + L_3\dot{L}_3\right) = 0 \tag{13.158}$$

The bracket is equivalent to $\frac{1}{2}\frac{d}{dt}\left(L_1^2 + L_2^2 + L_3^2\right) = 0$ which implies
that the total rotational angular momentum $L$ is a constant of
motion as expected for this torque-free system, even though the
individual components $L_1$, $L_2$, $L_3$ may vary. That is

$$L_1^2 + L_2^2 + L_3^2 = L^2 \tag{13.159}$$

Figure 13.6: Rotation of an asymmetric rigid rotor. The dark lines correspond to contours of constant total
rotational kinetic energy $T$, which has an ellipsoidal shape, projected onto the angular momentum $L$ sphere
in the body-fixed frame.

Note that equation 13.159 is the equation of a sphere of radius $L$.
Multiplying the first equation of 13.157 by $L_1$, the second by $L_2$, and the third by $L_3$, and summing gives

$$I_2 I_3 L_1\dot{L}_1 + I_1 I_3 L_2\dot{L}_2 + I_1 I_2 L_3\dot{L}_3 = 0 \tag{13.160}$$

Dividing by $I_1 I_2 I_3$ gives $\frac{L_1\dot{L}_1}{I_1} + \frac{L_2\dot{L}_2}{I_2} + \frac{L_3\dot{L}_3}{I_3} = 0$. This implies that the total rotational kinetic energy
$T$, given by

$$\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3} = T \tag{13.161}$$

is a constant of motion as expected when there are no external torques and zero energy dissipation. Note
that 13.161 is the equation of an ellipsoid.

Equations 13.159 and 13.161 both must be satisfied by the rotational motion for any value of the total
angular momentum $L$ and kinetic energy $T$. Figure 13.6 shows a graphical representation of the intersection of
the $L$ sphere and $T$ ellipsoid as seen in the body-fixed frame. The angular momentum vector $\mathbf{L}$ must follow
the constant-energy contours given by where the $T$-ellipsoids intersect the $L$-sphere, shown for the case where
$I_3 > I_2 > I_1$. Note that the precession of the angular momentum vector $\mathbf{L}$ follows a trajectory that has
closed paths that circle around the principal axis with the smallest $I$, that is, $\hat{\mathbf{e}}_1$, or the principal axis with
the maximum $I$, that is, $\hat{\mathbf{e}}_3$. However, the angular momentum vector does not have a stable minimum for
precession around the intermediate principal moment of inertia axis $\hat{\mathbf{e}}_2$. In addition to the precession, the
angular momentum vector $\mathbf{L}$ executes nutation, that is, a nodding of the angle $\theta$.

For any fixed value of $L$, the kinetic energy has upper and lower bounds given by

$$\frac{L^2}{2I_3} \leq T \leq \frac{L^2}{2I_1} \tag{13.162}$$


3 Similar discussions of the freely-rotating asymmetric top are given by Landau and Lifshitz [La60] and by Gregory [Gr06].

Thus, for a given value of $L$, when $T = T_{min} = \frac{L^2}{2I_3}$, the orientation of $\mathbf{L}$ in the body-fixed frame is either
$(0, 0, +L)$ or $(0, 0, -L)$, that is, aligned with the $\hat{\mathbf{e}}_3$ axis along which the principal moment of inertia is largest.
For slightly higher kinetic energy the trajectory of $\mathbf{L}$ follows closed paths precessing around $\hat{\mathbf{e}}_3$. When the
kinetic energy $T = \frac{L^2}{2I_2}$ the angular momentum vector $\mathbf{L}$ follows either of the two thin-line trajectories, each
of which is a separatrix. These do not have closed orbits around $\hat{\mathbf{e}}_2$ and they separate the closed solutions
around either $\hat{\mathbf{e}}_3$ or $\hat{\mathbf{e}}_1$. For higher kinetic energy the precessing angular momentum vector follows closed
trajectories around $\hat{\mathbf{e}}_1$ and becomes fully aligned with $\hat{\mathbf{e}}_1$ at the upper-bound kinetic energy.

Note that for the special case when $I_3 > I_2 = I_1$ the asymmetric rigid rotor reduces to the symmetric
rigid rotor for which Euler’s equations were solved exactly in chapter 13.20. For the symmetric
rigid rotor the $T$-ellipsoid becomes a spheroid aligned with the symmetry axis and thus the intersections
with the $L$-sphere lead to circular paths around the $\hat{\mathbf{e}}_3$ body-fixed principal axis, while the separatrix circles
the equator corresponding to the $\hat{\mathbf{e}}_3$ axis separating clockwise and anticlockwise precession about $L_3$. This
discussion shows that energy, plus angular momentum conservation, provide the general features of the
solution for the torque-free symmetric top that are in agreement with those derived using Euler’s equations
of motion.


13.22 Stability of torque-free rotation of an asymmetric body 


It is of interest to extend the prior discussion to address the stability of an asymmetric rigid rotor undergoing
force-free rotation close to a principal axis, that is, when subject to small perturbations. Consider the case
of a general asymmetric rigid body with $I_3 > I_2 > I_1$. Let the system start with rotation about the $\hat{e}_1$ axis,
that is, the principal axis associated with the moment of inertia $I_1$. Then


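$$\boldsymbol{\omega} = \omega_1 \hat{e}_1 \tag{13.163}$$

Consider that a small perturbation is applied causing the angular velocity vector to be

$$\boldsymbol{\omega} = \omega_1 \hat{e}_1 + \lambda \hat{e}_2 + \mu \hat{e}_3 \tag{13.164}$$

where $\lambda$, $\mu$ are very small. The Euler equations (13.156) become

$$(I_2 - I_3)\lambda\mu - I_1\dot{\omega}_1 = 0$$
$$(I_3 - I_1)\mu\omega_1 - I_2\dot{\lambda} = 0$$
$$(I_1 - I_2)\lambda\omega_1 - I_3\dot{\mu} = 0$$

Assuming that the product $\lambda\mu$ in the first equation is negligible, then $\dot{\omega}_1 = 0$, that is, $\omega_1$ is constant.
The other two equations can be solved to give

$$\dot{\lambda} = \left(\frac{I_3 - I_1}{I_2}\,\omega_1\right)\mu \tag{13.165}$$

$$\dot{\mu} = \left(\frac{I_1 - I_2}{I_3}\,\omega_1\right)\lambda \tag{13.166}$$

Take the time derivative of the first equation

$$\ddot{\lambda} = \left(\frac{I_3 - I_1}{I_2}\,\omega_1\right)\dot{\mu} \tag{13.167}$$

and substitute for $\dot{\mu}$, which gives

$$\ddot{\lambda} + \left(\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}\,\omega_1^2\right)\lambda = 0 \tag{13.168}$$

The solution of this equation is

$$\lambda(t) = A e^{i\Omega_1 t} + B e^{-i\Omega_1 t} \tag{13.169}$$

where

$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{13.170}$$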
Note that since it was assumed that $I_3 > I_2 > I_1$, then $\Omega_1$ is real. The solution for $\lambda(t)$ therefore represents a
stable oscillatory motion with precession frequency $\Omega_1$. The identical result is obtained for $\mu(t)$, that is, $\Omega_{1\mu} = \Omega_{1\lambda} = \Omega_1$.
Thus the motion corresponds to a stable minimum about the $\hat{e}_1$ axis with oscillations about the $\lambda = \mu = 0$
minimum with angular frequency

$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{13.171}$$


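Permuting the indices shows that perturbations applied to rotation about either the 2 or 3 axes give
precession frequencies

$$\Omega_2 = \omega_2\sqrt{\frac{(I_2 - I_1)(I_2 - I_3)}{I_3 I_1}} \tag{13.172}$$

$$\Omega_3 = \omega_3\sqrt{\frac{(I_3 - I_2)(I_3 - I_1)}{I_1 I_2}} \tag{13.173}$$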


Since $I_3 > I_2 > I_1$, then $\Omega_1$ and $\Omega_3$ are real while $\Omega_2$ is imaginary. Thus, whereas rotation about either
the $\hat{e}_1$ or the $\hat{e}_3$ axes is stable, the imaginary solution about $\hat{e}_2$ corresponds to a perturbation increasing
with time. Thus, only rotation about the axes with the largest or smallest moments of inertia is stable. Moreover, for
the symmetric rigid rotor, with $I_1 = I_2 \neq I_3$, stability exists only about the symmetry axis $\hat{e}_3$, independent
of whether the body is prolate or oblate. This result was implied from the discussion of energy and angular
momentum conservation in chapter 13.20. Friction was not included in the above discussion. In the presence
of dissipative forces, such as friction or drag, only rotation about the principal axis corresponding to the
maximum moment of inertia is stable.
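
The linear stability result can be checked numerically. The following is a minimal sketch, not part of the original text, that integrates the torque-free Euler equations with a hand-written fourth-order Runge-Kutta step. The moments of inertia (1, 2, 3), the perturbation size, and the function names are illustrative assumptions. A spin about the 1 or 3 axis should keep the off-axis components near the size of the perturbation, whereas a spin about the intermediate 2 axis should let them grow to order one.

```python
def euler_rates(w, I):
    """Right-hand side of the torque-free Euler equations for body-frame rates w."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)


def rk4_step(w, I, dt):
    """Advance w by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(wi + h * ki for wi, ki in zip(w, k))
    k1 = euler_rates(w, I)
    k2 = euler_rates(shifted(k1, dt / 2), I)
    k3 = euler_rates(shifted(k2, dt / 2), I)
    k4 = euler_rates(shifted(k3, dt), I)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))


def largest_off_axis_rate(axis, I, eps=1e-3, t_end=40.0, dt=5e-3):
    """Spin at unit rate about `axis` (0, 1 or 2) with a perturbation eps on the
    other two axes; return the largest off-axis component reached."""
    w = tuple(1.0 if k == axis else eps for k in range(3))
    largest = eps
    for _ in range(int(t_end / dt)):
        w = rk4_step(w, I, dt)
        largest = max(largest, max(abs(w[k]) for k in range(3) if k != axis))
    return largest


if __name__ == "__main__":
    moments = (1.0, 2.0, 3.0)  # assumed illustrative values with I1 < I2 < I3
    for axis in range(3):
        print("axis", axis + 1, "largest off-axis rate:",
              largest_off_axis_rate(axis, moments))
```

The integration time is chosen to be several e-folding times of the unstable mode for these assumed moments, so the growth about the intermediate axis has time to develop.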

Stability of rigid-body rotation has broad applications to rotation of satellites, molecules and nuclei. 
The first U.S. satellite, Explorer 1, was launched in 1958 with the rotation axis aligned with the cylindrical 
axis which was the minimum principal moment of inertia. After a few hours the satellite started tumbling 
with increasing amplitude due to a flexible antenna dissipating and transferring energy to the perpendicular 
axis which had the largest moment of inertia. Torque-free motion of a deformed rigid body is a ubiquitous
phenomenon in many branches of science, engineering, and sports as illustrated by the following examples.


13.10 Example: Tennis racquet dynamics 


A tennis racquet is an asymmetric body that exhibits the above rotational behavior. Assume that the head
of a tennis racquet is a uniform thin circular disk of radius $R$ and mass $M$ which is attached to a
cylindrical handle of radius $r = R/10$, length $2R$, and mass $M$ as shown in the figure. The principal
moments of inertia about the three axes through the center-of-mass can be calculated by addition of the
moments for the circular disk and the cylindrical handle and using both the parallel-axis and the
perpendicular-axis theorems.

Axis   Head                                        Handle                  Racquet
1      $\frac{1}{4}MR^2 + MR^2 = \frac{5}{4}MR^2$  $\frac{4}{3}MR^2$       $\frac{31}{12}MR^2$
2      $\frac{1}{4}MR^2$                           $\frac{1}{200}MR^2$     $\frac{51}{200}MR^2$
3      $\frac{1}{2}MR^2 + MR^2 = \frac{3}{2}MR^2$  $\frac{4}{3}MR^2$       $\frac{17}{6}MR^2$

Principal rotation axes for the center of mass of a tennis racket. The 1 and 2 axes are in the plane of the
racket head and the 3 axis is perpendicular to the plane of the racket head.

Note that $I_{11} : I_{22} : I_{33} = 2.5833 : 0.2550 : 2.8333$. Inserting these principal moments of inertia into
equations 13.171 to 13.173 gives the following precession frequencies

$$\Omega_1 = i\,0.8976\,\omega_1 \qquad \Omega_2 = 0.9056\,\omega_2 \qquad \Omega_3 = 0.9892\,\omega_3$$

The imaginary precession frequency $\Omega_1$ about the 1 axis implies unstable rotation leading to tumbling,
whereas the minimum moment $I_{22}$ and maximum moment $I_{33}$ imply stable rotation about the 2 and 3 axes.
This rotational behavior is easily demonstrated by throwing a tennis racquet and is called the tennis racquet
theorem. The center of percussion, example 2.14, is another important inertial property of a tennis racquet.
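
The quoted precession frequencies can be reproduced directly from equations 13.171 to 13.173. The short sketch below is an illustration added here, with a hypothetical helper name `precession_ratio`; it uses complex square roots so that an unstable axis shows up as an imaginary ratio. The moments are in units of $MR^2$ and are the values implied by the ratio 2.5833 : 0.2550 : 2.8333.

```python
import cmath


def precession_ratio(I_a, I_b, I_c):
    """Return Omega/omega for rotation about the axis with moment I_a,
    where I_b and I_c are the moments about the other two principal axes."""
    return cmath.sqrt((I_a - I_b) * (I_a - I_c) / (I_b * I_c))


# Principal moments of the model racquet in units of M R^2
I1, I2, I3 = 31 / 12, 51 / 200, 17 / 6

ratios = {
    1: precession_ratio(I1, I2, I3),  # intermediate moment: imaginary, unstable
    2: precession_ratio(I2, I3, I1),  # smallest moment: real, stable
    3: precession_ratio(I3, I1, I2),  # largest moment: real, stable
}

if __name__ == "__main__":
    for axis, ratio in ratios.items():
        print(axis, ratio)
```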
13.11 Example: Rotation of asymmetrically-deformed nuclei


Some nuclei and molecules have average shapes that have significant asymmetric deformation leading to 
interesting quantal analogs of the rotational properties of an asymmetrically-deformed rigid body. The major 
difference between a quantal and a classical rotor is that the energies, and angular momentum are quantized, 
rather than being continuously variable quantities. Otherwise, the quantal rotors exhibit general features 
similar to the classical analog. Studies [Cli86] of the rotational behavior of asymmetrically-deformed nuclei 
exploit three aspects of classical mechanics, namely classical Coulomb trajectories, rotational invariants, and 
the properties of ellipsoidal rigid-bodies. 

Ellipsoidal deformation can be specified by the dimensions along each of the three principal axes. Bohr
and Mottelson parameterized the ellipsoidal deformation in terms of three parameters, $R_0$ which is the radius
of the equivalent sphere, $\beta$ which is a measure of the magnitude of the ellipsoidal deformation from the sphere,
and $\gamma$ which specifies the deviation of the shape from axial symmetry. The ellipsoidal intrinsic shape can be
expressed in terms of the deviation from the equivalent sphere by the equation

$$\delta R(\theta,\phi) = R(\theta,\phi) - R_0 = R_0\sum_{\mu=-2}^{+2}\alpha^*_{2\mu}\,Y_{2\mu}(\theta,\phi) \tag{a}$$

where $Y_{\lambda\mu}(\theta,\phi)$ is a Laplace spherical harmonic defined as

$$Y_{\lambda\mu}(\theta,\phi) = \sqrt{\frac{(2\lambda+1)}{4\pi}\frac{(\lambda-\mu)!}{(\lambda+\mu)!}}\;P_{\lambda\mu}(\cos\theta)\,e^{i\mu\phi}$$

and $P_{\lambda\mu}(\cos\theta)$ is an associated Legendre function of $\cos\theta$. Spherical harmonics are the angular portion of a
set of solutions to Laplace's equation. Represented in a system of spherical coordinates, Laplace's spherical
harmonics $Y_{\lambda\mu}(\theta,\phi)$ are a specific set of spherical harmonics that form an orthogonal system. Spherical
harmonics are important in many theoretical and practical applications.

In the principal axis frame of the body, there are three non-zero quadrupole deformation parameters
which can be written in terms of the deformation parameters $\beta$, $\gamma$ where $\alpha_{20} = \beta\cos\gamma$, $\alpha_{21} = \alpha_{2-1} = 0$, and
$\alpha_{22} = \alpha_{2-2} = \frac{1}{\sqrt{2}}\beta\sin\gamma$. Using these in equation (a) gives the three semi-axis dimensions in the principal
axis frame (primed frame),

$$\delta R_k = \sqrt{\frac{5}{4\pi}}\,R_0\,\beta\cos\left(\gamma - \frac{2\pi k}{3}\right) \tag{b}$$

Note that for $\gamma = 0$, then $\delta R_1 = \delta R_2 = -\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_3 = +\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has prolate
deformation with the symmetry axis along the 3 axis. The same prolate shape is obtained for $\gamma = \frac{2\pi}{3}$ and
$\gamma = \frac{4\pi}{3}$ with the prolate symmetry axes along the 1 and 2 axes respectively. For $\gamma = \frac{\pi}{3}$ then
$\delta R_1 = \delta R_3 = +\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_2 = -\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has oblate deformation with the
symmetry axis along the 2 axis. The same oblate shape is obtained for $\gamma = \pi$ and $\gamma = \frac{5\pi}{3}$ with the oblate
symmetry axes along the 3 and 1 axes respectively. For other values of $\gamma$ the shape is ellipsoidal.

For the asymmetric deformed rigid body, the rotational Hamiltonian can be expressed in the form [Dav58]

$$H = \sum_{k=1}^{3}\frac{|R_k|^2}{2\left[4B\beta^2\sin^2\left(\gamma' - \frac{2\pi k}{3}\right)\right]}$$
where the rotational angular momentum is $\mathbf{R}$. The principal moments of inertia are related by the triaxiality
parameter $\gamma'$, which they assumed is identical to the shape parameter $\gamma$. For axial symmetry the moment of
inertia about the symmetry axis is taken to be zero for a quantal system since rotation of the potential well
about the symmetry axis corresponds to no change in the potential well, or corresponding rotation of the bound
nucleons. That is, the nucleus is not a rigid body; the nucleons only rotate to the extent that the ellipsoidal
potential well is cranked around such that the nucleons must follow the rotation of the potential well. In
addition, vibrational modes coexist about the average asymmetric deformation, plus octupole deformation
often coexists with the above quadrupole deformed modes.
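
The prolate and oblate limits quoted above follow directly from equation (b). As an illustrative sketch, not taken from the original text, the helper below (the name `semi_axis_deviations` and the default $R_0 = 1$ are assumptions) evaluates the three deviations $\delta R_k$ for a given $\beta$ and $\gamma$, so that the special cases can be checked numerically.

```python
import math


def semi_axis_deviations(beta, gamma, R0=1.0):
    """Deviations (dR1, dR2, dR3) of the semi-axes from the equivalent sphere,
    from dR_k = sqrt(5/(4 pi)) * R0 * beta * cos(gamma - 2 pi k / 3)."""
    scale = math.sqrt(5.0 / (4.0 * math.pi)) * R0 * beta
    return tuple(scale * math.cos(gamma - 2.0 * math.pi * k / 3.0)
                 for k in (1, 2, 3))


if __name__ == "__main__":
    for label, gamma in (("gamma = 0", 0.0), ("gamma = pi/3", math.pi / 3)):
        print(label, semi_axis_deviations(beta=0.3, gamma=gamma))
```

For any $\gamma$ the three deviations sum to zero, which reflects volume conservation to first order in $\beta$.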
13.23 Symmetric rigid rotor subject to torque about a fixed point


The motion of a symmetric top rotating in a gravitational field, with 
one point at a fixed location, is encountered frequently in rotational 
motion. Examples are the gyroscope and a child’s spinning top. 
Rotation of a rigid rotor subject to torque about a fixed point, is a 
case where it is necessary to take the inertia tensor with respect to 
the fixed point in the body, and not at the center of mass. 

Consider the geometry, shown in figure 13.7, where the symmet- 
ric top of mass M is spinning about a fixed tip that is displaced by 
a distance h from the center of mass. The tip of the top is assumed 
to be at the origin of both the space-fixed frame (x,y,z) and the 
body-fixed frame (1,2,3). Assume that the translational velocity 
is zero and let the principal moments about the fixed point of the
symmetric top be $I_1 = I_2 \neq I_3$.

The Lagrange equations of motion can be derived assuming that the kinetic energy equals the rotational
kinetic energy, that is, it is assumed that the translational kinetic energy $T_{trans} = 0$. Then the
kinetic energy of an inertially-symmetric rigid rotor can be derived as for the torque-free symmetric top,
as given in equation 13.145, to be

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{13.174}$$

$$T = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.175}$$

Figure 13.7: Symmetric top spinning about one fixed point.

Since the potential energy is $U = Mgh\cos\theta$ then the Lagrangian equals

$$L = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 - Mgh\cos\theta \tag{13.176}$$

The angular momentum about the space-fixed $z$ axis, $p_\phi$, is conjugate to $\phi$. From Lagrange's equations

$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = 0 \tag{13.177}$$

that is, $p_\phi$ is a constant of motion given by the generalized momentum

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = \left(I_1\sin^2\theta + I_3\cos^2\theta\right)\dot{\phi} + I_3\dot{\psi}\cos\theta = S_z = \text{constant} \tag{13.178}$$

where $S_z$ is the angular momentum projection along the space-fixed $z$ axis.
Similarly, the angular momentum about the body-fixed 3 axis is conjugate to $\psi$. From Lagrange's equations,

$$\dot{p}_\psi = \frac{\partial L}{\partial \psi} = 0 \tag{13.179}$$

that is, $p_\psi$ is a constant of motion given by the generalized momentum

$$p_\psi = \frac{\partial L}{\partial \dot{\psi}} = I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = B_3 = \text{constant} \tag{13.180}$$

where $B_3$ is the angular momentum projection along the body-fixed 3 axis. The above two relations can be
solved to give the precessional angular velocity $\dot{\phi}$ about the space-fixed $z$ axis

$$\dot{\phi} = \frac{p_\phi - p_\psi\cos\theta}{I_1\sin^2\theta} = \frac{S_z - B_3\cos\theta}{I_1\sin^2\theta} \tag{13.181}$$

and the spin angular velocity $\dot{\psi}$ about the body-fixed 3 axis

$$\dot{\psi} = \frac{p_\psi}{I_3} - \frac{\left(p_\phi - p_\psi\cos\theta\right)\cos\theta}{I_1\sin^2\theta} = \frac{B_3}{I_3} - \frac{\left(S_z - B_3\cos\theta\right)\cos\theta}{I_1\sin^2\theta} \tag{13.182}$$

Since $p_\phi$ and $p_\psi$ are constants of motion, i.e. $S_z$, $B_3$, then these rotational angular velocities depend on only
$I_1$, $I_3$ and $\theta$.
There is one further constant of motion available if no frictional forces act on the system, that is,
energy conservation. This implies that the total energy

$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 + Mgh\cos\theta \tag{13.183}$$

will be a constant of motion. But the middle term on the right-hand side also is a constant of motion

$$\frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.184}$$

Thus energy conservation can be rewritten by defining an energy $E'$ where

$$E' \equiv E - \frac{p_\psi^2}{2I_3} = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + Mgh\cos\theta = \text{constant} \tag{13.185}$$

This can be written as

$$E' = \frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{13.186}$$

which can be expressed as

$$E' = \frac{1}{2}I_1\dot{\theta}^2 + V(\theta) \tag{13.187}$$

where $V(\theta)$ is an effective potential

$$V(\theta) \equiv \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta = \frac{\left(S_z - B_3\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{13.188}$$

Figure 13.8: Effective potential diagram for a spinning symmetric top as a function of theta.

The effective potential $V(\theta)$ is shown in figure 13.8. It is clear that the motion of a symmetric top with
effective energy $E'$ is confined to angles $\theta_1 < \theta < \theta_2$.
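
The turning points $\theta_1$, $\theta_2$ are the roots of $V(\theta) = E'$, which generally must be found numerically. The following is a minimal sketch under stated assumptions: the function names, the grid-plus-bisection search, and all parameter values are illustrative choices and not part of the original text. The example case is a top released with $\dot{\phi} = 0$ and $\dot{\theta} = 0$ at $\theta = \pi/3$, for which $p_\phi = p_\psi\cos(\pi/3)$ and $E' = Mgh\cos(\pi/3)$.

```python
import math


def effective_potential(theta, p_phi, p_psi, I1, Mgh):
    """V(theta) of equation 13.188."""
    return ((p_phi - p_psi * math.cos(theta)) ** 2
            / (2.0 * I1 * math.sin(theta) ** 2) + Mgh * math.cos(theta))


def bisect(f, a, b, tol=1e-12):
    """Root of f in [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0.0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


def turning_points(E_prime, p_phi, p_psi, I1, Mgh, n=2000):
    """Angles in (0, pi) where V(theta) = E', located by a sign-change scan."""
    def f(theta):
        return effective_potential(theta, p_phi, p_psi, I1, Mgh) - E_prime
    grid = [math.pi * (i + 0.5) / n for i in range(n)]
    return [bisect(f, a, b) for a, b in zip(grid, grid[1:]) if f(a) * f(b) < 0.0]


if __name__ == "__main__":
    # Assumed illustrative values in arbitrary consistent units
    I1, Mgh, p_psi, theta0 = 1.0, 1.0, 4.0, math.pi / 3
    p_phi = p_psi * math.cos(theta0)      # zero precession rate at release
    E_prime = Mgh * math.cos(theta0)      # zero nutation rate at release
    print(turning_points(E_prime, p_phi, p_psi, I1, Mgh))
```

For this release condition one turning point is the release angle itself, and the other is the lower limit of the nodding motion.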

Note that the above result also is obtained if the Routhian is used, rather than the Lagrangian, as
mentioned in chapter 8.7, and defined by equation (8.65). That is, the Routhian can be written as

$$R(\theta, \dot{\theta}, p_\phi, p_\psi)_{cyclic} = \dot{\phi}p_\phi + \dot{\psi}p_\psi - L = H(\phi, p_\phi, \psi, p_\psi) - L(\theta, \dot{\theta})$$

$$= -\frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta \tag{13.189}$$

The Routhian $R(\theta, \dot{\theta}, p_\phi, p_\psi)_{cyclic}$ acts like a Hamiltonian for the $(\phi, p_\phi)$ and $(\psi, p_\psi)$ variables, whose momenta are
constants of motion, and thus these are ignorable variables. The Routhian acts as the negative Lagrangian for the
remaining variable $\theta$, with rotational kinetic energy $\frac{1}{2}I_1\dot{\theta}^2$ and effective potential energy $V_{eff}$

$$V_{eff} = \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta = V(\theta) + \frac{p_\psi^2}{2I_3}$$

The equation of motion describing the system in the rotating frame is given by one Lagrange equation

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial \dot{\theta}}\right) - \frac{\partial R_{cyclic}}{\partial \theta} = 0$$

The negative sign of the Routhian cancels out when used in the Lagrange equation. Thus, in the rotating
frame of reference, the system is reduced to a single degree of freedom, the nutation angle $\theta$, with effective
energy $E'$ given by equations 13.186 to 13.188.
Figure 13.9: Nutational motion of the body-fixed symmetry axis projected onto the space-fixed unit sphere.
The three cases are (a) $\dot{\phi}$ never vanishes, (b) $\dot{\phi} = 0$ at $\theta = \theta_2$, (c) $\dot{\phi}$ changes sign between $\theta_1$ and $\theta_2$.


The motion of the symmetric top is simplest at the minimum value of the effective potential curve, where
$E' = V_{min}$, at which the nutation angle $\theta$ is restricted to a single value $\theta = \theta_0$. The motion is a steady precession
at a fixed angle of inclination, that is, the “sleeping top”. Solving for $\left(\frac{dV}{d\theta}\right)_{\theta=\theta_0} = 0$ gives that

$$p_\phi - p_\psi\cos\theta_0 = \frac{p_\psi\sin^2\theta_0}{2\cos\theta_0}\left[1 \pm \sqrt{1 - \frac{4MghI_1\cos\theta_0}{p_\psi^2}}\,\right] \tag{13.190}$$

If $\theta_0 < \frac{\pi}{2}$, then to ensure that the solution is real requires a minimum value of the angular momentum on the
body-fixed axis of $p_\psi^2 \geq 4MghI_1\cos\theta_0$. If $\theta_0 > \frac{\pi}{2}$ then there is no minimum angular momentum projection
on the body-fixed axis. There are two possible solutions to the quadratic relation corresponding to either a
slow or fast precessional frequency. Usually the slow precession is observed.
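
Combining equation 13.190 with equation 13.181 gives the two steady precession rates explicitly, $\dot{\phi} = \frac{p_\psi}{2I_1\cos\theta_0}\left[1 \pm \sqrt{1 - 4MghI_1\cos\theta_0/p_\psi^2}\right]$. The sketch below evaluates both roots; the function name and the numerical values are illustrative assumptions, and the formula as coded assumes $\theta_0 \neq \pi/2$.

```python
import math


def steady_precession_rates(p_psi, I1, Mgh, theta0):
    """Return the two steady precession rates (minus root, plus root) for a top
    inclined at theta0. For theta0 < pi/2 and p_psi > 0 the first is the slow one."""
    c = math.cos(theta0)
    discriminant = 1.0 - 4.0 * Mgh * I1 * c / p_psi ** 2
    if discriminant < 0.0:
        raise ValueError("spin too small for steady precession at this angle")
    root = math.sqrt(discriminant)
    prefactor = p_psi / (2.0 * I1 * c)
    return prefactor * (1.0 - root), prefactor * (1.0 + root)


if __name__ == "__main__":
    # Assumed illustrative values in arbitrary consistent units
    slow, fast = steady_precession_rates(p_psi=10.0, I1=1.0, Mgh=1.0,
                                         theta0=math.pi / 3)
    print("slow:", slow, "fast:", fast)
```

For a rapidly spinning top the slow root approaches $Mgh/p_\psi$, the familiar gyroscopic precession rate, which is independent of the inclination.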

For the general case, where $E' > V_{min}$, the nutation angle $\theta$ between the space-fixed and body-fixed 3
axes varies in the range $\theta_1 < \theta < \theta_2$. This axis exhibits a nodding variation which is called nutation. Figure
13.9 shows the projection of the body-fixed symmetry axis on the unit sphere in the space-fixed frame. Note
that the observed nutation behavior depends on the relative sizes of $p_\phi$ and $p_\psi\cos\theta$. For certain values, the
precession $\dot{\phi}$ changes sign between the two limiting values of $\theta$, producing a looping motion as shown in figure
13.9c. Another condition is where the precession is zero at $\theta_2$, producing a cusp at $\theta_2$ as illustrated in figure
13.9b. This behavior can be demonstrated using the gyroscope or the symmetric top.


13.12 Example: The Spinning “Jack” 


The game “Jacks” is played using metal jacks, each of which comprises six equal masses $m$ at the opposite
ends of orthogonal axes of length $l$. Consider one jack spinning around the body-fixed 3 axis with the lower
mass at a fixed point on the ground, and with a steady precession around the space-fixed vertical axis $z$
with angle $\theta$ as shown. Assume that the body-fixed axes align with the arms of the jack.

Jack comprises six bodies of mass $m$ at each end of orthogonal arms of length $l$.

The principal moments of inertia about one mass are given by the parallel axis theorem to be
$I_1 = I_2 = 4ml^2 + 6ml^2 = 10ml^2$ and $I_3 = 4ml^2$.

In the rotating body-fixed frame the torque due to gravity, which acts along the line of nodes, has components

$$\mathbf{N} = \begin{pmatrix} 6mgl\sin\theta\cos\psi \\ -6mgl\sin\theta\sin\psi \\ 0 \end{pmatrix}$$

and the components of the angular velocity are

$$\boldsymbol{\omega} = \begin{pmatrix} \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \\ \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \\ \dot{\phi}\cos\theta + \dot{\psi} \end{pmatrix}$$

Using Euler's equations (13.103) for the above components of $\mathbf{N}$ and $\boldsymbol{\omega}$ in the body-fixed frame gives

$$10\dot{\omega}_1 - 6\omega_2\omega_3 = \frac{6g}{l}\sin\theta\cos\psi \tag{a}$$

$$10\dot{\omega}_2 + 6\omega_1\omega_3 = -\frac{6g}{l}\sin\theta\sin\psi \tag{b}$$

$$\dot{\omega}_3 = 0 \tag{c}$$

Equation (c) relates the spin about the 3 axis, the precession, and the angle to the vertical $\theta$, that is

$$\omega_3 = \dot{\phi}\cos\theta + \dot{\psi} = \Omega\cos\theta + s = \text{constant}$$

where $\dot{\psi} = s$ is the spin and $\dot{\phi} = \Omega$ is the precession angular velocity.

If the spin axis is nearly vertical, $\theta \approx 0$ and thus $\sin\theta \approx \theta$ and $\cos\theta \approx 1$. Forming the combination
(a) $\times \cos\psi$ $-$ (b) $\times \sin\psi$ and using the equations for the components of $\boldsymbol{\omega}$ gives

$$5\ddot{\theta} + \left(2\Omega s - 3\Omega^2 - \frac{3g}{l}\right)\theta = 0$$

The bracket must be positive to have stable sinusoidal oscillations. That is, for $\Omega > 0$, the spin angular velocity $s$
required for the jack to spin about a stable vertical axis is given by

$$s > \frac{3\Omega}{2} + \frac{3g}{2l\Omega}$$

This example illustrates the conditions required for stable rotation of any axially-symmetric top.
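
As a numerical illustration, the stability requirement that the bracket $2\Omega s - 3\Omega^2 - 3g/l$ be positive can be turned into a threshold spin for a given precession rate. The sketch below is an added example, not part of the original text; the function name and the 2 cm arm length are assumptions. It also uses the fact that the threshold is smallest when $\Omega = \sqrt{g/l}$, where it equals $3\sqrt{g/l}$.

```python
import math


def threshold_spin(precession_rate, arm_length, g=9.81):
    """Spin s at which 2*Omega*s - 3*Omega**2 - 3*g/l changes sign (Omega > 0).
    Spins above this value give stable small oscillations about the vertical."""
    if precession_rate <= 0.0:
        raise ValueError("this sketch assumes a positive precession rate")
    return 1.5 * (precession_rate + g / (arm_length * precession_rate))


if __name__ == "__main__":
    l = 0.02                        # assumed arm length in metres
    omega_best = math.sqrt(9.81 / l)  # precession rate giving the lowest threshold
    print("lowest threshold spin (rad/s):", threshold_spin(omega_best, l))
```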


13.13 Example: The Tippe Top 


The Tippe Top comprises a section of a sphere, to which a short cylindrical rod is mounted on the planar
section, as illustrated. When the Tippe Top is spun on a horizontal surface this top exhibits the perverse
behavior of transitioning from rotation with the spherical head resting on the horizontal surface, to
flipping over such that it rotates resting on its elongated cylindrical rod. The orientation of angular
momentum remains roughly vertical as expected from conservation of angular momentum. This implies that the
rotation with respect to the body-fixed axes must invert as the top inverts. The center of mass is raised
when the top inverts; the additional potential energy is provided by a reduction in the rotational kinetic
energy.

The Tippe Top behavior was first discovered in the 1890's but adequate solutions of the equations of motion
have only been developed since the 1950's. Since the top precesses around the vertical axis, the point of
contact is not on the symmetry axis of the top. Sliding friction between the surface of the spinning top and
the horizontal surface provides a torque that causes the precession of the top to increase and eventually
flip up onto the cylindrical peg. The Tippe Top is typical of many phenomena in physics where the underlying
physics principle can be recognized but a detailed and rigorous solution can be complicated.

The system has five degrees of freedom, $x, y$ which specify the location on the horizontal plane, plus the
three Euler angles $(\psi, \theta, \phi)$. The paper by Cohen [Coh77] explains the motion in terms of Euler angles using
the laboratory to body-fixed transformation relation. It shows that friction plays a pivotal role in the motion
contrary to some earlier claims. Ciocci and Langerock [Cio07] used the Routhian $R_{cyclic}$ to reduce the number
of degrees of freedom from 5 to 2, namely $\theta$ which is the tilt angle, and $\phi'$ which is the orientation of the
tilt. This Routhian $R_{cyclic}$ is a Lagrangian in two dimensions that was used to derive the equations of motion
via the Lagrange-Euler equations

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial \dot{\theta}}\right) - \frac{\partial R_{cyclic}}{\partial \theta} = Q_\theta$$

$$\frac{d}{dt}\left(\frac{\partial R_{cyclic}}{\partial \dot{\phi}'}\right) - \frac{\partial R_{cyclic}}{\partial \phi'} = Q_{\phi'}$$

where the $Q_\theta$, $Q_{\phi'}$ are generalized torques about the two angles that take into account the sliding frictional
forces. This sophisticated Routhian reduction approach provides an exhaustive and refined solution for the
Tippe Top and confirms that sliding friction plays a key role in the unusual behavior of the Tippe Top.

The geometry of the Tippe Top of radius $R$ spinning on a horizontal surface with slipping friction acting
between the top and the horizontal plane. The center of mass is a distance $a$ from the center of the
spherical section along the axis of symmetry of the top.


13.24 The rolling wheel 


As discussed in chapter 5.7, the rolling wheel is a non-holonomic system that is simple in principle, but 
in practice the solution can be complicated, as illustrated by the Tippe Top. Chapter 13.23 discussed the 
motion of a symmetric top rotating about a fixed point on the symmetry axis when subject to a torque. The 
rolling wheel involves rotation of a symmetric rigid body that is subject to torques. However, the point of 
contact of the wheel with a static plane is on the periphery of the wheel, and friction at the point of contact 
is assumed to ensure zero slip. Note that friction is necessary to ensure that the rotating object rolls without 
slipping, but the frictional force does no work for pure rolling of an undeformable rigid wheel. 

The coordinate system employed is shown in Figure 13.10. For simplicity it is better to use a moving
coordinate frame $(1,2,3)$ that is fixed to the orientation of the wheel with the origin at the center of mass
of the wheel, but this moving reference frame does not include the angular velocity $\dot{\psi}$ of the disk about the
3 axis. That is, the moving $(1,2,3)$ frame has angular velocities

$$\begin{aligned}
\omega_1 &= \dot{\theta} \\
\omega_2 &= \dot{\phi}\sin\theta \\
\omega_3 &= \dot{\phi}\cos\theta
\end{aligned} \tag{13.191}$$

The frame fixed in the rotating wheel must include the additional angular velocity of the disk $\dot{\psi}$ about the
$\hat{e}_3$ axis, that is

$$\begin{aligned}
\Omega_1 &= \omega_1 = \dot{\theta} \\
\Omega_2 &= \omega_2 = \dot{\phi}\sin\theta \\
\Omega_3 &= \omega_3 + \dot{\psi} = \dot{\phi}\cos\theta + \dot{\psi}
\end{aligned} \tag{13.192}$$

where $\boldsymbol{\Omega}$ designates the angular velocity of the rotating disk, while $\boldsymbol{\omega}$ designates the rotation of the moving
frame $(1,2,3)$.

The principal moments of inertia of a thin circular disk are related by the perpendicular axis theorem
(chapter 13.9)

$$I_1 + I_2 = I_3$$

Since $I_1 = I_2$ for a uniform disk, therefore $I_3 = 2I_1$.

Equation 12.16 can be used to relate the vector forces $\mathbf{F}$ in the space-fixed frame to the rate of change
of momenta in the moving frame $(1,2,3)$.

$$\mathbf{F} = \dot{\mathbf{p}}_{space} = \dot{\mathbf{p}}_{moving} + \boldsymbol{\omega}\times\mathbf{p} \tag{13.193}$$

This leads to the following relations for the three components in the moving frame

$$\begin{aligned}
F_1 &= \dot{p}_1 + \omega_2 p_3 - \omega_3 p_2 \\
F_2 - Mg\sin\theta &= \dot{p}_2 + \omega_3 p_1 - \omega_1 p_3 \\
F_3 - Mg\cos\theta &= \dot{p}_3 + \omega_1 p_2 - \omega_2 p_1
\end{aligned} \tag{13.194}$$
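Figure 13.10: Uniform disk rolling on a horizontal plane as viewed in the (a) fixed frame, and (b) rolling
disk frame. The space-fixed axis system is $(x, y, z)$, while the moving reference frame $(1,2,3)$ is centered at
the center of mass of the disk with the 1,2 axes in the plane of the disk. The disk is rotating with a uniform
angular velocity $\dot{\psi}$ about the 3 axis and rolling in the direction that is at an angle $\phi$ relative to the $x$ axis.

where $F_1, F_2, F_3$ are the reactive forces shown in figure 13.10.
Similarly, the torques $\mathbf{N}$ in the space-fixed frame can be related to the rate of change of angular momentum
by

$$\mathbf{N} = \dot{\mathbf{L}}_{space} = \dot{\mathbf{L}}_{moving} + \boldsymbol{\omega}\times\mathbf{L} \tag{13.195}$$

where $L_i = I_i\Omega_i$. This leads to the following relations for the three torque equations in the moving frame

$$\begin{aligned}
N_1 &= -F_3 R = I_1\dot{\Omega}_1 + I_3\Omega_3\omega_2 - I_2\Omega_2\omega_3 \\
N_2 &= 0 = I_2\dot{\Omega}_2 + I_1\Omega_1\omega_3 - I_3\Omega_3\omega_1 \\
N_3 &= F_1 R = I_3\dot{\Omega}_3 + I_2\Omega_2\omega_1 - I_1\Omega_1\omega_2
\end{aligned} \tag{13.196}$$

The rolling constraints are

$$\begin{aligned}
p_1 + MR\Omega_3 &= 0 \\
p_2 &= 0 \\
p_3 - MR\Omega_1 &= 0
\end{aligned} \tag{13.197}$$

where $p_i = Mv_i$. Combining equations 13.194, 13.196, 13.197 gives

$$\begin{aligned}
\left(I_1 + MR^2\right)\dot{\Omega}_1 + \left(I_3 + MR^2\right)\omega_2\Omega_3 - I_2\omega_3\Omega_2 &= -MgR\cos\theta \\
I_2\dot{\Omega}_2 + I_1\omega_3\Omega_1 - I_3\omega_1\Omega_3 &= 0 \\
\left(I_3 + MR^2\right)\dot{\Omega}_3 + I_2\omega_1\Omega_2 - \left(I_1 + MR^2\right)\omega_2\Omega_1 &= 0
\end{aligned} \tag{13.198}$$

These are the torque equations about the point of contact O.
Introduction of equations 13.191 and 13.192 into equation 13.198 expresses the equations of motion in
terms of the Euler angles to be

$$\begin{aligned}
\left(I_1 + MR^2\right)\ddot{\theta} + \left(I_3 + MR^2\right)\dot{\phi}\sin\theta\left(\dot{\phi}\cos\theta + \dot{\psi}\right) - I_1\dot{\phi}^2\sin\theta\cos\theta &= -MgR\cos\theta \\
I_1\ddot{\phi}\sin\theta + 2I_1\dot{\theta}\dot{\phi}\cos\theta - I_3\dot{\theta}\left(\dot{\phi}\cos\theta + \dot{\psi}\right) &= 0 \\
\left(I_3 + MR^2\right)\left(\ddot{\phi}\cos\theta - \dot{\phi}\dot{\theta}\sin\theta + \ddot{\psi}\right) - MR^2\dot{\theta}\dot{\phi}\sin\theta &= 0
\end{aligned} \tag{13.199}$$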
Equations 13.199 are non-linear, and a closed-form solution is possible only for limited cases such as when
$\theta = 90^\circ$.
Note that the above equations of motion also can be derived using Lagrangian mechanics knowing that

$$L = \frac{1}{2}M\left(v_1^2 + v_2^2 + v_3^2\right) + \frac{1}{2}I_1\left(\Omega_1^2 + \Omega_2^2\right) + \frac{1}{2}I_3\Omega_3^2 - MgR\sin\theta$$

where the center of mass of the disk is at height $R\sin\theta$, consistent with the gravitational torque $-MgR\cos\theta$
in equations 13.198. The differential equations of constraint can be derived from equations 13.197 to be

$$\begin{aligned}
dx - R\cos\phi\,d\psi &= 0 \\
dy - R\sin\phi\,d\psi &= 0
\end{aligned}$$

Generalized forces plus the Lagrange-Euler equations (6.45) can be used to derive the equations of
motion and solve for the components of the constraint force $F_1$, $F_2$, and $F_3$.


13.14 Example: Tipping stability of a rolling wheel 


A circular wheel rolling in a vertical plane at high angular velocity initially rolls in a straight line and
remains vertical. However, below a certain angular velocity, gyroscopic forces become weaker and the wheel
will tip sideways and veer rapidly from the initial direction. It is interesting to estimate the minimum angular
velocity of the disk such that it does not start to tip over sideways.

Note that equations 13.199 are satisfied for $\theta = \frac{\pi}{2}$, $\dot{\phi} = 0$ and $\dot{\psi} = \Omega_3 = $ constant. Assume a small
disturbance causes the tilt angle to be $\theta = \frac{\pi}{2} + \alpha$ where $\alpha$ is small and that $\dot{\phi}$ is non-zero but small, that is,
$\dot{\theta} = \dot{\alpha}$ and $\dot{\phi}$ are small. Keeping only terms to first order in the third of equations 13.199, and integrating
gives

$$\dot{\phi}\cos\theta + \dot{\psi} = \Omega_3 \tag{a}$$

To first order, the first two of equations 13.199 become

$$\left(I_1 + MR^2\right)\ddot{\alpha} + \left(I_3 + MR^2\right)\dot{\phi}\,\Omega_3 - MgR\alpha = 0 \tag{b}$$

$$I_1\ddot{\phi} - I_3\Omega_3\dot{\alpha} = 0 \tag{c}$$

Integrating equation (c) gives

$$\dot{\phi} = \frac{I_3\Omega_3}{I_1}\,\alpha \tag{d}$$

Inserting (d) into (b) gives

$$\left(I_1 + MR^2\right)\ddot{\alpha} + \left[\left(I_3 + MR^2\right)\frac{I_3}{I_1}\Omega_3^2 - MgR\right]\alpha = 0 \tag{e}$$

Equation (e) has a stable oscillatory solution when the square bracket is positive, that is,

$$\Omega_3^2 > \frac{I_1 MgR}{I_3\left(I_3 + MR^2\right)} \tag{f}$$

which gives the minimum angular velocity required for stable rolling motion. For angular velocity less than the
minimum, the square bracket in equation (e) is negative, leading to an exponentially growing, divergent
solution. For a uniform disk the perpendicular axis theorem gives $I_3 = 2I_1 = \frac{1}{2}MR^2$ for which equation (f)
gives

$$\Omega_3^2 > \frac{g}{3R} \tag{g}$$

Therefore the critical linear velocity of the wheel is

$$v = R\Omega_3 > \sqrt{\frac{gR}{3}} \tag{h}$$

The bicycle wheel provides a common example of the tipping of a rolling wheel. For the typical 0.35 m
radius of a bicycle wheel, this gives a critical velocity of $v > 1.07$ m/s $= 2.4$ mph.4


4The stability of the bicycle is sensitive to the castor and other aspects of the steering geometry of the front wheel, in 
addition to the gyroscopic effects. Excellent articles on this subject have been written by D.E.H. Jones Physics Today 23(4) 
(1970) 34, and also by J. Lowell & H.D. McKell, American Journal of Physics 50 (1982) 1106. 
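
The critical speed follows from the condition that the square bracket in the linearized tilt equation be positive, $\Omega_3^2 > I_1MgR/[I_3(I_3 + MR^2)]$. As a short numerical sketch (the function name and the unit mass are illustrative assumptions), the following evaluates that condition for a uniform disk with the 0.35 m radius quoted above.

```python
import math


def critical_rolling_speed(R, M, I1, I3, g=9.81):
    """Minimum speed v = R * Omega3 for upright rolling, from
    Omega3**2 > I1*M*g*R / (I3 * (I3 + M*R**2))."""
    omega_squared = I1 * M * g * R / (I3 * (I3 + M * R ** 2))
    return R * math.sqrt(omega_squared)


if __name__ == "__main__":
    R, M = 0.35, 1.0                       # radius in metres; mass cancels out
    I3 = 0.5 * M * R ** 2                  # uniform disk, axis through the hub
    I1 = 0.5 * I3                          # perpendicular-axis theorem
    v = critical_rolling_speed(R, M, I1, I3)
    print(f"critical speed: {v:.2f} m/s = {v / 0.44704:.1f} mph")  # 1.07 m/s = 2.4 mph
```

Passing different moments of inertia, for example those of a hoop with $I_3 = MR^2$, shows how the mass distribution changes the threshold.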
13.15 Example: Pivoting


A rolling and a pivoting body can lead to confusion as to whether to compute the angular momentum and 
kinetic energy with respect to the center of mass, or the point of contact on the circumference of the body for 
rolling, or of the pivot point for a fixed pivot. For pivoting or rolling of a wheel it is useful to compare the 
angular momentum and total energy computed with respect to (1) the center of mass of a cylinder and (2) 
with respect to the point of contact of the cylinder with the horizontal ground plane. 

Consider a cylinder of radius $R$ and mass $m$ pivoting about the point of contact with the plane with
angular velocity $\omega = \frac{v}{R}$ where $v$ is the instantaneous velocity of the center of mass. The angular momentum
about the pivot point is

$$L_{pivot} = I_{pivot}\,\omega$$

The parallel-axis theorem relates the moment of inertia with respect to the pivot point and center of mass

$$I_{pivot} = mR^2 + I_{cm}$$

The angular velocities of the center of mass, and about the center of mass, are identical since the pivot point
is fixed, that is

$$\omega_{pivot} = \omega_{cm} = \omega$$

Thus the angular momentum about the pivot point is given by the sum of the angular momenta

$$L_{pivot} = I_{pivot}\,\omega = mR^2\omega + I_{cm}\,\omega$$

That is, the angular momentum is the sum of the angular momentum of the body about the center of mass,
plus the angular momentum of the center of mass about the pivot point. This is an example of Chasles
theorem.

The kinetic energy is given only by the rotational energy since the pivot point is stationary

$$KE_{pivot} = \frac{1}{2}I_{pivot}\,\omega^2 = \frac{1}{2}mR^2\omega^2 + \frac{1}{2}I_{cm}\,\omega^2 = \frac{1}{2}mv^2 + \frac{1}{2}I_{cm}\,\omega^2$$

That is, it equals the kinetic energy of rotation about the center of mass plus the instantaneous kinetic energy
for translation of the center of mass in agreement with Chasles theorem. Thus, for pivoting, the angular
momentum and kinetic energy are the same if evaluated using either center of mass coordinates or using the
pivot point as the reference point.


13.16 Example: Rolling 


Consider the same system except the cylinder is rolling without slipping on a plane. The subtle difference
between pivoting and rolling is that the rolling point of contact and the center of mass are moving at the same
velocity in contrast to pivoting where the point of contact is stationary. Thus for rolling there is no angular
momentum of the center of mass with respect to the point of contact. Therefore the angular momentum about
the instantaneous point of contact is

$$L_{rolling} = mR^2\cdot 0 + I_{cm}\,\omega = I_{cm}\,\omega$$

That is, the angular momentum only includes the angular momentum about the center of mass which is
smaller than the angular momentum for the same body pivoting about a point on the periphery of the cylinder.
The kinetic energy is given by

$$KE_{rolling} = \frac{1}{2}mv^2 + \frac{1}{2}I_{rolling}\,\omega^2 = \frac{1}{2}mv^2 + \frac{1}{2}I_{cm}\,\omega^2$$

Thus the angular momentum is significantly smaller for rolling relative to pivoting of a given body, whereas
the kinetic energy is the same for both rolling or pivoting of a given body.
13.25 Dynamic balancing of wheels


For rotating machinery it is crucial that rotors be both statically and dynamically balanced. Static balance
means that the center of mass is on the axis of rotation. Dynamic balance means that the axis of rotation is
a principal axis.

For example, consider the symmetric rotor that has its symmetry axis at an angle $\phi$ to the axis of rotation.
In this case the system is statically balanced since the center of gravity is on the axis of rotation. However,
the rotation axis is at an angle $\phi$ to the symmetry axis. This implies that the axle has to provide a torque
to maintain rotation that is not along a principal axis. If you distort the front wheel of your car by hitting it
sideways against the sidewalk curb, or if the wheel is not dynamically balanced, then you will find that the
steering wheel can vibrate wildly at certain speeds due to the torques caused by dynamic imbalance shaking
the steering mechanism. This can be especially bad when the rotation frequency is close to a resonant
frequency of the suspension system. Insist that your automobile wheels are dynamically balanced when you
change tires; static balancing will not eliminate the dynamic imbalance forces. Another example is that the
ailerons, rudder, and elevator on aircraft usually are dynamically balanced to stop the build up of oscillations
that can couple to flexing and flutter of the airframe which can lead to airframe failure.


13.17 Example: Forces on the bearings of a rotating circular disk 


A homogeneous circular disk of mass $M$ and radius $R$ rotates with constant angular velocity $\omega$ about a
body-fixed axis passing through the center of the circular disk, as shown in the adjacent figure. The rotation
axis is inclined at an angle $\alpha$ to the symmetry axis of the circular disk by bearings on both sides of the
disk spaced a distance $d$ apart. Determine the forces on the bearings.

[Figure: Rotation of circular disk about an axis that is at an angle $\alpha$ to the symmetry axis of the
circular disk.]

Choose the body-fixed axes such that $\hat{e}_3$ is along the symmetry axis of the circular disk, and $\hat{e}_1$
points in the plane of the disk symmetry axis and the rotation axis. These axes are the principal axes for which
the inertia tensor can be calculated to be

$$\mathbf{I} = \frac{MR^2}{4}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

Note that for this thin plane lamina $I_{11} + I_{22} = I_{33}$. The components of the angular velocity vector
$\boldsymbol{\omega}$ along the three body-fixed axes are given by

$$\boldsymbol{\omega} = (\omega \sin\alpha,\; 0,\; \omega \cos\alpha)$$

Since it is assumed that $\dot{\omega} = 0$, substituting into Euler’s equations (13.103) gives the torques
acting to be

$$N_1 = N_3 = 0$$

$$N_2 = -\omega^2 \sin\alpha \cos\alpha \, \frac{1}{4} M R^2$$

That is, the torque is in the $\hat{e}_2$ direction. Thus the forces $\mathbf{F}$ on the bearings can be
calculated since $\mathbf{N} = \mathbf{r} \times \mathbf{F}$, thus

$$|F| = \frac{|N|}{2d} = M R^2 \omega^2 \frac{\sin 2\alpha}{16 d}$$

Estimate the size of these forces for the front wheel of your car travelling at 70 m.p.h. if the rotation axis is
displaced by 2° from the symmetry axis of the wheel.
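The estimate requested above can be sketched numerically. The following is only an illustration under stated assumptions: the wheel is treated as a homogeneous disk, and the mass, radius, and lever arm between the bearing reaction forces (`MASS`, `RADIUS`, `LEVER_ARM`) are hypothetical values chosen for illustration, not values given in the text. The speed is taken to be 70 m.p.h. and the wheel is assumed to roll without slipping.

```python
import math

def bearing_torque(mass, radius, omega, alpha):
    """Magnitude of N_2 = (1/4) M R^2 omega^2 sin(alpha) cos(alpha) for a thin disk."""
    return 0.25 * mass * radius**2 * omega**2 * math.sin(alpha) * math.cos(alpha)

# Assumed, illustrative values (not taken from the text).
MASS = 20.0              # kg, wheel plus tire treated as a homogeneous disk
RADIUS = 0.30            # m
LEVER_ARM = 0.10         # m, assumed separation of the two bearing reaction forces
SPEED = 70 * 0.44704     # 70 m.p.h. converted to m/s
ALPHA = math.radians(2.0)

omega = SPEED / RADIUS                   # rolling without slipping
torque = bearing_torque(MASS, RADIUS, omega, ALPHA)
force = torque / LEVER_ARM               # equal and opposite forces forming a couple

print(f"omega  = {omega:.1f} rad/s")
print(f"torque = {torque:.1f} N m")
print(f"force  = {force:.0f} N")
```

With these assumed numbers the torque comes out at roughly 170 N m, so even a 2° misalignment leads to bearing forces well above the static weight of the wheel. The result scales as the square of the speed.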
Figure 13.11: Forward two-and-a-half somersaults with two twists demonstrates unequivocally that a diver
can initiate continuous twisting in midair. In the illustrated maneuver the diver does more than one full 
somersault before he starts to twist. To maintain the twisting the diver does not have to move his legs.[Fro80] 


13.26 Rotation of deformable bodies 


The discussion in this chapter has assumed that the rotating body is a rigid body. However, there is a
broad and important class of problems in classical mechanics where the rotating body is deformable, which
leads to intriguing new phenomena. The classic example is the cat, which, if dropped upside down with zero
angular momentum, is able to distort its body plus tail in order to rotate such that it lands on its feet in 
spite of the fact that there are no external torques acting and thus the angular momentum is conserved. 
Another example is the high diver doing a forward two-and-a-half somersault with two twists.[Fro80] Once 
the diver leaves the board then the total angular momentum must be conserved since there are no external 
torques acting on the system. The diver begins a somersault by rotating about a horizontal axis which is a 
principal axis that is perpendicular to the axis of his body passing through his hips. Initially the angular 
momentum and angular velocity are parallel and point perpendicular to the symmetry axis. The diver first
goes into a tuck, which greatly reduces his moment of inertia about the axis of the somersault and
concomitantly increases his angular velocity about this axis, and he performs one full somersault prior to
initiating twisting. Then the diver twists his body and moves his arms to destroy the axial symmetry of his
body, which changes the direction of the principal axes of the inertia tensor. This causes the angular velocity
to change in both direction and magnitude such that the angular momentum remains conserved. The angular 
velocity now is no longer parallel to the angular momentum resulting in a component along the length of 
the body causing it to twist while somersaulting. This twisting motion will continue until the symmetry 
of the diver’s body is restored which is done just before entering the water. By skilled timing, and body 
movement, the diver restores the symmetry of his body to the optimum orientation for entering the water. 
Such phenomena involving deformable bodies are important to the motion of ballet dancers, jugglers, astronauts
in space, and satellite motion. The above rotational phenomena would be impossible if the cat or diver were 
rigid bodies having a fixed inertia tensor. Calculation of the dynamics of the motion of deformable bodies 
is complicated and beyond the scope of this book, but the concept of a time dependent transformation of 
the inertia tensor underlies the subsequent motion. The theory is complicated since it is difficult even to 
quantify what corresponds to rotation as the body morphs from one shape to another. Further information 
on this topic can be found in the literature. [Fro80] 
13.27 Summary


This chapter has introduced the important topic of rigid-body rotation which has many applications in
physics, engineering, sports, etc. 


Inertia tensor The concept of the inertia tensor was introduced where the 9 components of the inertia
tensor are given by

$$I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \left( \sum_{k}^{3} x_k^2 \right) - x_i x_j \right) dV \tag{13.14}$$

Steiner’s parallel-axis theorem

$$J_{11} = I_{11} + M\left( \left( a_1^2 + a_2^2 + a_3^2 \right)\delta_{11} - a_1^2 \right) = I_{11} + M\left( a_2^2 + a_3^2 \right) \tag{13.43}$$

relates the inertia tensor about the center of mass to that about a parallel axis system not through the center
of mass.

Diagonalization of the inertia tensor about any point was used to find the corresponding principal axes
of the rigid body.


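Angular momentum The angular momentum $\mathbf{L}$ for rigid-body rotation is expressed in terms of the
inertia tensor and angular velocity $\boldsymbol{\omega}$ by

$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{13.56}$$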


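Rotational kinetic energy The rotational kinetic energy is

$$T_{rot} = \frac{1}{2} \begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix} \cdot \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{13.72}$$

$$T_{rot} = T = \frac{1}{2} \boldsymbol{\omega} \cdot \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} \tag{13.73}$$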


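Euler angles The Euler angles relate the space-fixed and body-fixed principal axes. The angular velocity
$\boldsymbol{\omega}$ expressed in terms of the Euler angles has components for the angular velocity in the
body-fixed axis system (1,2,3)

$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \tag{13.86}$$
$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \tag{13.87}$$
$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi} \cos\theta + \dot{\psi} \tag{13.88}$$

Similarly, the components of the angular velocity for the space-fixed axis system (x,y,z) are

$$\omega_x = \dot{\theta} \cos\phi + \dot{\psi} \sin\theta \sin\phi \tag{13.89}$$
$$\omega_y = \dot{\theta} \sin\phi - \dot{\psi} \sin\theta \cos\phi \tag{13.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi} \cos\theta \tag{13.91}$$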
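As a consistency check on equations 13.86 to 13.91, the magnitude of the angular velocity must be the same whether it is resolved in the body-fixed or the space-fixed frame. The short sketch below evaluates both sets of components for an arbitrary, hypothetical choice of Euler angles and rates; the function names and test values are illustrative only.

```python
import math

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Body-fixed components (omega_1, omega_2, omega_3), equations 13.86-13.88."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return (w1, w2, w3)

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Space-fixed components (omega_x, omega_y, omega_z), equations 13.89-13.91."""
    wx = dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi)
    wy = dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi)
    wz = dphi + dpsi * math.cos(theta)
    return (wx, wy, wz)

def norm_squared(vector):
    return sum(component * component for component in vector)

# Arbitrary illustrative angles (rad) and rates (rad/s): phi, theta, psi, and their derivatives.
state = (0.3, 1.1, -0.7, 0.5, -1.2, 2.0)
body_sq = norm_squared(omega_body(*state))
space_sq = norm_squared(omega_space(*state))
print(body_sq, space_sq)
```

Both expressions reduce to $\dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$, which is independent of the frame, as expected for a scalar.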


Rotational invariants The powerful concept of the rotational invariance of scalar properties was intro- 
duced. Important examples of rotational invariants are the Hamiltonian, Lagrangian, and Routhian. 


Euler equations of motion for rigid-body motion The dynamics of rigid-body rotational motion was
explored and the Euler equations of motion were derived using both Newtonian and Lagrangian mechanics.

$$N_1^{ext} = I_1 \dot{\omega}_1 - (I_2 - I_3)\, \omega_2 \omega_3 \tag{13.103}$$
$$N_2^{ext} = I_2 \dot{\omega}_2 - (I_3 - I_1)\, \omega_3 \omega_1$$
$$N_3^{ext} = I_3 \dot{\omega}_3 - (I_1 - I_2)\, \omega_1 \omega_2$$

Lagrange equations of motion for rigid-body motion The Euler equations of motion for rigid-body
motion, given in equation 13.103, were derived using the Lagrange-Euler equations.


Torque-free motion of rigid bodies The Euler equations and Lagrangian mechanics were used to study 
torque-free rotation of both symmetric and asymmetric bodies including discussion of the stability of torque- 
free rotation. 
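The stability statement can be illustrated by integrating the torque-free Euler equations numerically. The sketch below is a minimal illustration, not the treatment used in this chapter: it assumes hypothetical principal moments (1, 2, 3) in arbitrary units, uses a fixed-step fourth-order Runge-Kutta integrator, and starts with the spin almost entirely about the intermediate axis. The kinetic energy and $L^2$ should stay constant while the small component $\omega_1$ grows to order unity.

```python
def euler_rhs(w, inertia):
    """Torque-free Euler equations (13.103 with N = 0) solved for the angular accelerations."""
    i1, i2, i3 = inertia
    w1, w2, w3 = w
    return (
        (i2 - i3) * w2 * w3 / i1,
        (i3 - i1) * w3 * w1 / i2,
        (i1 - i2) * w1 * w2 / i3,
    )

def rk4_step(w, inertia, dt):
    """Advance the angular velocity by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = euler_rhs(w, inertia)
    k2 = euler_rhs(shifted(w, k1, dt / 2), inertia)
    k3 = euler_rhs(shifted(w, k2, dt / 2), inertia)
    k4 = euler_rhs(shifted(w, k3, dt), inertia)
    return tuple(
        wi + dt * (a + 2 * b + 2 * c + d) / 6
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

def kinetic_energy(w, inertia):
    return 0.5 * sum(i * wi * wi for i, wi in zip(inertia, w))

def angular_momentum_squared(w, inertia):
    return sum((i * wi) ** 2 for i, wi in zip(inertia, w))

INERTIA = (1.0, 2.0, 3.0)   # assumed principal moments, arbitrary units
w = (0.01, 1.0, 0.01)       # spin almost entirely about the intermediate axis
energy_start = kinetic_energy(w, INERTIA)
l2_start = angular_momentum_squared(w, INERTIA)

max_w1 = abs(w[0])
for _ in range(20000):      # 20 time units with dt = 0.001
    w = rk4_step(w, INERTIA, 0.001)
    max_w1 = max(max_w1, abs(w[0]))

energy_drift = kinetic_energy(w, INERTIA) - energy_start
l2_drift = angular_momentum_squared(w, INERTIA) - l2_start
print("energy drift:", energy_drift)
print("L^2 drift:   ", l2_drift)
print("largest |w1| reached:", max_w1)
```

Repeating the run with the initial spin along axis 1 or axis 3 instead, for example `w = (1.0, 0.01, 0.01)`, keeps the small components small, consistent with stable rotation about the axes of smallest and largest moment of inertia.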


Rotating symmetric body subject to a torque The complicated motion exhibited by a symmetric top, 
that is spinning about one fixed point and subject to a torque, was introduced and solved using Lagrangian 
mechanics. 


The rolling wheel The non-holonomic motion of rolling wheels was introduced, as well as the importance 
of static and dynamic balancing of rotating machinery.


Rotation of deformable bodies The complicated non-holonomic motion involving rotation of deformable 
bodies was introduced. 
Workshop exercises


1. Three objects are described below. Break up into three groups, one group per object, and determine the inertia 
tensor. 


• A very thin sheet with a mass density $\sigma = Cxy$ where $C$ is a positive constant. The sheet lies in the xy
plane and its sides are both of length $a$.

• An inclined-plane shaped block of mass $M$ is oriented with one corner at the origin as shown.

• An equilateral triangle made up of three thin rods of length $l$ and uniform mass density $\rho$.

2. Consider the objects described in problem 1.


(a) For the first object (the thin sheet), determine the principal moments of inertia. 
(b) For the second object (the inclined plane), determine the principal axes. 


(c) For the third object (the equilateral triangle), determine the products of inertia. 
3. Consider the inertia tensor. 


(a) What are the advantages of diagonalizing the inertia tensor? 
(b) How can the inertia tensor be diagonalized? 


(c) What can you say about a tensor that is real and symmetric? 
4. A hollow spherical shell has a mass m and radius R. 


(a) Calculate the inertia tensor for a set of coordinates whose origin is at the center of mass of the shell. 


(b) Now suppose that the shell is rolling without slipping toward a step of height h, where h < R. The shell 
has a linear velocity v. What is the angular momentum of the shell relative to the tip of the step? 


(c) The shell now strikes the tip of the step inelastically (so that the point of contact sticks to the step, 
but the shell can still rotate about the tip of the step). What is the angular momentum of the shell 
immediately after contact? 


(d) Finally, find the minimum velocity which enables the shell to surmount the step. Express your result in 
terms of m, g, R, and h. 


5. The vectors $\hat{x}$, $\hat{y}$, and $\hat{z}$ constitute a set of orthogonal right-handed axes. The vectors
$\hat{x} + \hat{y} - 2\hat{z}$, $-\hat{x} + \hat{y}$, and $\hat{x} + \hat{y} + \hat{z}$ are also perpendicular to one another.


(a) Write out the set of direction cosines relating the new axes to the old. 


(b) How are the Eulerian angles defined? Describe this transformation by a set of Eulerian angles. 




6. A torsional pendulum consists of a vertical wire attached to a mass which can rotate about the vertical axis. 


Consider three torsional pendula which consist of identical wires from which identical homogeneous solid cubes 
are hung. One cube is hung from a corner, one from midway along an edge, and one from the middle of a face 
as shown. What are the ratios of the periods of the three pendula? 


7. A dumbbell comprises two equal point masses $M$ connected by a massless rigid rod of length $2A$ which is
constrained to rotate about an axle fixed to the center of the rod at an angle $\theta$ as shown in the figure. The
center of the rod is at the origin of the coordinates, the axle along the z-axis, and the dumbbell lies in the
$x-y$ plane at $t = 0$. The angular velocity $\omega$ is a constant in time and is directed along the z axis.


a) Calculate all elements of the inertia tensor. Be sure to specify the coordinate system used. 


b) Using the calculated inertia tensor find the angular momentum of the dumbbell in the laboratory frame as 
a function of time. 


c) Using the equation $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, calculate the angular momentum and show that it is equal to the answer
of part (b). 


d) Calculate the torque on the axle as a function of time. 


e) Calculate the kinetic energy of the dumbbell. 


8. A heavy symmetric top has a mass $m$ with the center of mass a distance $h$ from the fixed point about which
it spins and $I_1 = I_2 \neq I_3$. The top is precessing at a steady angular velocity $\Omega$ about the vertical
space-fixed z axis. What is the minimum spin $\omega'$ about the body-fixed symmetry axis, that is, the 3 axis, assuming
that the 3 axis is inclined at an angle $\theta$ with respect to the vertical z axis? Solve the problem at the instant
when the z, 2,3, 1 axes all are in the same plane as shown in the figure. 
9. Consider an object with its center of mass at the origin and inertia tensor

$$\mathbf{I} = I \begin{pmatrix} 1/2 & -1/2 & 0 \\ -1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

(a) Determine the principal moments of inertia and the principal axes. Guess the object.

(b) Determine the rotation matrix $\mathbf{R}$ and compute $\mathbf{R}^{T}\mathbf{I}\mathbf{R}$. Do the diagonal elements match with your results
from (a)? Note: columns of $\mathbf{R}$ are eigenvectors of $\mathbf{I}$.

(c) Assume $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} + \hat{y})$. Determine $\mathbf{L}$ in the rotating coordinate system. Are $\mathbf{L}$ and $\boldsymbol{\omega}$ in the same
direction? What does this mean?

(d) Repeat (c) for $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} - \hat{y})$. What is different and why?

(e) For which case will there be a non-zero torque required?

(f) Determine the rotational kinetic energy for the case $\boldsymbol{\omega} = \frac{\omega}{\sqrt{2}}(\hat{x} - \hat{y})$.


10. Consider a wheel (solid disk) of mass $m$ and radius $r$. The wheel is subject to angular velocities
$\boldsymbol{\omega}_A = \omega_A \hat{n}$, where $\hat{n}$ is normal to the surface, and $\boldsymbol{\omega}_B = \omega_B \hat{z}$.

(a) Choose a set of principal axes by observation.

(b) Determine the angular velocities and angular momentum along the principal axes. Note: $I_1 = \frac{1}{2} m r^2$ and
$I_2 = I_3 = \frac{1}{4} m r^2$.


(c) Determine the torque. 


(d) Determine the rotation matrix that rotates the fixed coordinate system to the body coordinate system. 


11. Determine the principal moments of inertia of an ellipsoid given by the equation, 


12. Determine the principal moments of inertia of a sphere of radius $R$ with a cavity of radius $r$ located a
distance $\epsilon$ from the center of the sphere.


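13. Three equal masses $m$ form the vertices of an equilateral triangle of side length $L$. The masses are located at
$\left(0, 0, \frac{L}{\sqrt{3}}\right)$, $\left(0, \frac{L}{2}, -\frac{L}{2\sqrt{3}}\right)$, and $\left(0, -\frac{L}{2}, -\frac{L}{2\sqrt{3}}\right)$, such that the center-of-mass is located at the origin.

(a) Determine the principal moments of inertia and principal axes.

Now consider the same system rotated 45° about the z-axis. The masses are located at $\left(0, 0, \frac{L}{\sqrt{3}}\right)$,
$\left(-\frac{L}{2\sqrt{2}}, \frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{3}}\right)$, and $\left(\frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{2}}, -\frac{L}{2\sqrt{3}}\right)$, respectively.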


(b) Determine the principal moments of inertia and principal axes. 
(c) Could you have answered (b) without explicitly determining the inertia tensor? How? 
Problems


1. Calculate the moments of inertia $I_1$, $I_2$, $I_3$ for a homogeneous cone of mass $M$ whose height is $h$ and whose
base has a radius R. Choose the x3-axis along the symmetry axis of the cone. 


a) Choose the origin at the apex of the cone, and calculate the elements of the inertia tensor. 
b) Make a transformation such that the center of mass of the cone is the origin and find the principal moments 
of inertia. 

2. Four masses, all of mass $m$, lie in the $x-y$ plane at positions $(x, y) = (a, 0), (-a, 0), (0, +2a), (0, -2a)$.
These are joined by massless rods to form a rigid body.

(a) Find the inertia tensor, using the x, y, z axes as a reference system. Exhibit the tensor as a matrix.

(b) Consider a direction given by the unit vector $\hat{n}$ that lies equally between the positive x, y, z axes; that is,
it makes equal angles with these three directions. Find the moment of inertia for rotation about this $\hat{n}$ axis.

(c) Given that at a certain time $t$ the angular velocity vector lies along the above direction $\hat{n}$, find, for that
instant, the angle between the angular momentum vector and $\hat{n}$.


3. A homogeneous cube, each edge of which has a length $l$, initially is in a position of unstable equilibrium with
one edge of the cube in contact with a horizontal plane. The cube then is given a small displacement causing
it to tip over and fall. Show that the angular velocity of the cube when one face strikes the plane is given by

$$\omega^2 = A \frac{g}{l} \left( \sqrt{2} - 1 \right)$$

where $A = 3/2$ if the edge cannot slide on the plane, and where $A = 12/5$ if sliding can occur without friction.


4. A symmetric body moves without the influence of forces or torques. Let $x_3$ be the symmetry axis of the body
and $\mathbf{L}$ be along $x_3'$. The angle between $\boldsymbol{\omega}$ and $x_3$ is $\alpha$. Let $\boldsymbol{\omega}$ and $\mathbf{L}$ initially be in the $x_2 - x_3$ plane. What is
the angular velocity of the symmetry axis about $\mathbf{L}$ in terms of $I_1$, $I_3$, $\omega$, and $\alpha$?


5. Consider a thin rectangular plate with dimensions a by b and mass M. Determine the torque necessary to 
rotate the thin plate with angular velocity w about a diagonal. Explain the physical behavior for the case when 
a=b. 


Chapter 14 


Coupled linear oscillators 


14.1 Introduction 


Chapter 3 discussed the behavior of a single linearly-damped linear oscillator subject to a harmonic force. 
No account was taken for the influence of the single oscillator on the driver for the case of forced oscillations. 
Many systems in nature comprise complicated free or forced oscillations of coupled-oscillator systems.
Examples of coupled oscillators are: automobile suspension systems, electronic circuits, electromagnetic fields,
musical instruments, atoms bound in a crystal, neural circuits in the brain, networks of pacemaker cells in 
the heart, etc. Energy can be transferred back and forth between coupled oscillators as the motion evolves. 
It is possible to describe the motion of coupled linear oscillators in terms of a sum over independent normal 
coordinates, i.e. normal modes, even though the motion may be very complicated. These normal modes 
are constructed from the original coordinates in such a way that the normal modes are uncoupled. The 
topic of finding the normal modes of coupled oscillator systems is a ubiquitous problem encountered in all 
branches of science and engineering. As discussed in chapter 3, oscillatory motion of non-linear systems 
can be complicated. Fortunately most oscillatory systems are approximately linear when the amplitude of 
oscillation is small. This discussion assumes that the oscillation amplitudes are sufficiently small to ensure 
linearity. 


14.2 Two coupled linear oscillators 


Consider the two-coupled linear oscillator, shown in figure 14.1, which comprises two identical masses each
connected to fixed locations by identical springs having a force constant $\kappa$. A spring with force constant
$\kappa'$ couples the two oscillators. The equilibrium lengths of the outer two springs are $l$ while that of the
coupling spring is $l'$. The problem is simplified by restricting the motion to be along the line connecting
the masses and assuming fixed endpoints. The small displacements of $m_1$ and $m_2$ are taken to be $x_1$ and $x_2$
with respect to the equilibrium positions $l$ and $l + l'$ respectively. The restoring force on $m_1$ is
$-\kappa x_1 - \kappa'(x_1 - x_2)$ while the restoring force on $m_2$ is $-\kappa x_2 - \kappa'(x_2 - x_1)$. This coupled
double-oscillator system exhibits basic features of coupled linear oscillator systems.

Figure 14.1: Two coupled linear oscillators. The equilibrium spring-lengths are $l$ for the outer springs and
$l'$ for the coupling spring. The displacements from the stable locations are given by $x_1$ and $x_2$. The
separation between the two masses is $r$ and the location of the center-of-mass is $R_{cm}$.

Assuming $m_1 = m_2 = m$, then the equations of motion are

$$m\ddot{x}_1 + (\kappa + \kappa')\, x_1 - \kappa' x_2 = 0 \tag{14.1}$$
$$m\ddot{x}_2 + (\kappa + \kappa')\, x_2 - \kappa' x_1 = 0$$

Assume that the motion for these coupled equations is oscillatory with a solution of the form

$$x_1 = B_1 e^{i\omega t} \tag{14.2}$$
$$x_2 = B_2 e^{i\omega t}$$

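where the constants $B$ may be complex to take into account both the magnitude and phase. Substituting
these possible solutions into the equations of motion gives

$$-m\omega^2 B_1 e^{i\omega t} + (\kappa + \kappa')\, B_1 e^{i\omega t} - \kappa' B_2 e^{i\omega t} = 0 \tag{14.3}$$
$$-m\omega^2 B_2 e^{i\omega t} + (\kappa + \kappa')\, B_2 e^{i\omega t} - \kappa' B_1 e^{i\omega t} = 0$$

Collecting terms, and cancelling the common exponential factor, gives

$$(\kappa + \kappa' - m\omega^2)\, B_1 - \kappa' B_2 = 0 \tag{14.4}$$
$$(\kappa + \kappa' - m\omega^2)\, B_2 - \kappa' B_1 = 0$$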


The existence of a non-trivial solution of these two simultaneous equations requires that the determinant of
the coefficients of $B_1$ and $B_2$ must vanish, that is

$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0 \tag{14.5}$$

The expansion of this secular determinant yields

$$(\kappa + \kappa' - m\omega^2)^2 - \kappa'^2 = 0 \tag{14.6}$$

Solving for $\omega$ gives

$$\omega = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{m}} \tag{14.7}$$

That is, there are two characteristic frequencies (or eigenfrequencies) for the system

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} \tag{14.8}$$

$$\omega_2 = \sqrt{\frac{\kappa}{m}} \tag{14.9}$$

Since superposition applies for these linear equations, then the general solution can be written as a sum of
the terms that account for the two possible values of $\omega$.

Figure 14.2: Displacement of each of two coupled linear harmonic oscillators with $\kappa = 4$ and $\kappa' = 1$ in
relative units.

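Figure 14.2 shows the solutions for a case where $\kappa = 4$ and $\kappa' = 1$, in arbitrary units, with the initial
condition that $x_2 = D$, and $x_1 = \dot{x}_1 = \dot{x}_2 = 0$. The two characteristic frequencies are
$\omega_1 = \sqrt{6/m}$ and $\omega_2 = \sqrt{4/m}$. The characteristic beats phenomenon is exhibited where the envelope
over one complete cycle of the low frequency encompasses several higher frequency oscillations. That is, the
solution is

$$x_2(t) = \frac{D}{4}\left[ e^{i\omega_1 t} + e^{-i\omega_1 t} + e^{i\omega_2 t} + e^{-i\omega_2 t} \right] = D \cos\left[ \left( \frac{\omega_1 + \omega_2}{2} \right) t \right] \cos\left[ \left( \frac{\omega_1 - \omega_2}{2} \right) t \right] \tag{14.10}$$

while

$$x_1(t) = \frac{D}{4}\left[ e^{i\omega_2 t} + e^{-i\omega_2 t} - e^{i\omega_1 t} - e^{-i\omega_1 t} \right] = D \sin\left[ \left( \frac{\omega_1 + \omega_2}{2} \right) t \right] \sin\left[ \left( \frac{\omega_1 - \omega_2}{2} \right) t \right] \tag{14.11}$$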


The energy in the two-coupled oscillators flows back and forth between the coupled oscillators as illus- 
trated in figure 14.2. 

A better understanding of the energy flow occurring between the two coupled oscillators is given by
using a $(x_1, x_2)$ configuration-space plot, shown in figure 14.3. The flow of energy occurring between the two
coupled oscillators can be represented by choosing normal-mode coordinates $\eta_1$ and $\eta_2$ that are rotated by
45° with respect to the spatial coordinates $(x_1, x_2)$. These normal-mode coordinates $(\eta_1, \eta_2)$ correspond to
the two normal modes of the coupled double-oscillator system.
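The eigenfrequencies and the beat form of the solution can be checked numerically. The sketch below assumes $m = 1$ and $D = 1$ in the same arbitrary units as the $\kappa = 4$, $\kappa' = 1$ case discussed above; the function names are hypothetical. It compares the sum-of-cosines solution that satisfies $x_2(0) = D$, $x_1(0) = 0$ with both masses released from rest against the product (beat) form.

```python
import math

def eigenfrequencies(kappa, kappa_c, m):
    """Roots of the secular equation (kappa + kappa' - m w^2)^2 - kappa'^2 = 0."""
    w1 = math.sqrt((kappa + 2 * kappa_c) / m)
    w2 = math.sqrt(kappa / m)
    return w1, w2

def displacements(t, w1, w2, d):
    """x1(t), x2(t) for x2(0) = d, x1(0) = 0, both masses released from rest."""
    x2 = 0.5 * d * (math.cos(w2 * t) + math.cos(w1 * t))
    x1 = 0.5 * d * (math.cos(w2 * t) - math.cos(w1 * t))
    return x1, x2

def beat_form(t, w1, w2, d):
    """The same solution written as a fast oscillation inside a slow envelope."""
    mean = 0.5 * (w1 + w2)
    half_difference = 0.5 * (w1 - w2)
    x2 = d * math.cos(mean * t) * math.cos(half_difference * t)
    x1 = d * math.sin(mean * t) * math.sin(half_difference * t)
    return x1, x2

W1, W2 = eigenfrequencies(4.0, 1.0, 1.0)   # kappa = 4, kappa' = 1, m = 1 (assumed)
print("omega_1 =", W1, " omega_2 =", W2)
for t in (0.0, 0.7, 3.1, 12.5):
    print(t, displacements(t, W1, W2, 1.0), beat_form(t, W1, W2, 1.0))
```

The envelope frequency $(\omega_1 - \omega_2)/2$ is small compared with the mean frequency $(\omega_1 + \omega_2)/2$, which is what produces the beat pattern in which the energy moves from one mass to the other and back.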




14.3 Normal modes 


The normal modes of the two-coupled oscillator system are obtained by a transformation to a pair of normal
coordinates $(\eta_1, \eta_2)$ that are independent and correspond to the two normal modes. The pair of normal
coordinates for this case are

$$\eta_1 = x_1 - x_2 \tag{14.12}$$
$$\eta_2 = x_1 + x_2$$

that is

$$x_1 = \frac{1}{2}(\eta_2 + \eta_1) \tag{14.13}$$
$$x_2 = \frac{1}{2}(\eta_2 - \eta_1)$$

Substituting these into the equations of motion (14.1) gives

$$m(\ddot{\eta}_1 + \ddot{\eta}_2) + (\kappa + 2\kappa')\, \eta_1 + \kappa \eta_2 = 0 \tag{14.14}$$
$$m(\ddot{\eta}_1 - \ddot{\eta}_2) + (\kappa + 2\kappa')\, \eta_1 - \kappa \eta_2 = 0$$

Adding and subtracting these two equations gives

$$m\ddot{\eta}_1 + (\kappa + 2\kappa')\, \eta_1 = 0 \tag{14.15}$$
$$m\ddot{\eta}_2 + \kappa \eta_2 = 0$$

Note that the two coordinates $\eta_1$ and $\eta_2$ are uncoupled and therefore are independent. The solutions of
these equations are

$$\eta_1(t) = C_1^{+} e^{i\omega_1 t} + C_1^{-} e^{-i\omega_1 t} \tag{14.16}$$
$$\eta_2(t) = C_2^{+} e^{i\omega_2 t} + C_2^{-} e^{-i\omega_2 t}$$

where $\eta_1$ corresponds to angular frequency $\omega_1$, and $\eta_2$ corresponds to $\omega_2$. The two coordinates $\eta_1$ and
$\eta_2$ are called the normal coordinates and the two solutions are the normal modes with corresponding
angular frequencies, $\omega_1$ and $\omega_2$.

Figure 14.3: Motion of two coupled harmonic oscillators in the $(x_1, x_2)$ spatial configuration space and in
terms of the normal modes $(\eta_1, \eta_2)$. Initial conditions are $x_2 = D$, $x_1 = \dot{x}_1 = \dot{x}_2 = 0$.

The $(\eta_1, \eta_2)$ axes of the two normal modes correspond to a rotation of 45° in configuration space, figure
14.3. The initial conditions chosen correspond to $\eta_1 = -\eta_2$ and thus both modes are excited with equal
intensity. Note that there are 5 lobes along the $\eta_1$ axis versus 4 lobes along the $\eta_2$ axis reflecting the ratio
of the eigenfrequencies $\omega_1$ and $\omega_2$. Also note that the diamond shape of the motion in the $(x_1, x_2)$
configuration space illustrates that the extrema amplitudes for $x_2$ are a maximum when $x_1$ is zero, and vice
versa. This is equivalent to the statement that the energies of the two oscillators are coupled, with the energy
for the first oscillator being a maximum when the energy is a minimum for the second oscillator, and vice
versa. By contrast, in the $(\eta_1, \eta_2)$ configuration space, the motion is bounded by a rectangle parallel to the
$(\eta_1, \eta_2)$ axes reflecting the fact that the extrema amplitudes, and corresponding energies, for the $\eta_1$
normal mode are constant and independent of the motion for the $\eta_2$ normal mode, and vice versa. The
decoupling of the two normal modes is best illustrated by considering the case when only one of these two
normal modes is excited. For the initial conditions $x_1(0) = -x_2(0)$, and $\dot{x}_1(0) = -\dot{x}_2(0)$, then
$\eta_2(t) = 0$. That is, only the $\eta_1(t)$ normal mode is excited with frequency $\omega_1$ which corresponds to motion
confined to the $\eta_1$ axis of figure 14.3.


Figure 14.4: Normal modes for two coupled oscillators: the antisymmetric mode (out of phase) and the
symmetric mode (in phase).
As shown in figure 14.4, $\eta_1(t)$ is the antisymmetric mode in which the two masses oscillate out of phase
such as to keep the center of mass of the two masses stationary. For the initial conditions $x_1(0) = x_2(0)$,
and $\dot{x}_1(0) = \dot{x}_2(0)$, then $\eta_1(t) = 0$, that is, only the $\eta_2(t)$ normal mode is excited. The $\eta_2(t)$ normal mode
is the symmetric mode where the two masses oscillate in phase with frequency $\omega_2$; it corresponds to motion
along the $\eta_2$ axis. For the symmetric mode, both masses move together leading to a constant extension of
the coupling spring. As a result the frequency $\omega_2$ of the symmetric mode $\eta_2(t)$ is lower than the frequency
$\omega_1$ of the antisymmetric mode $\eta_1(t)$. That is, the antisymmetric mode is stiffer since all three springs provide
active restoring forces, compared to the symmetric mode where the coupling spring is uncompressed. In
general, for attractive forces the lowest frequency always occurs for the mode with the highest symmetry.
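The decoupling of the two normal modes can also be seen by integrating the coupled equations of motion directly. The sketch below is illustrative only: it assumes $\kappa = 4$, $\kappa' = 1$, $m = 1$ in arbitrary units, releases the masses from rest, and uses a simple velocity-Verlet integrator (a choice made here, not one taken from the text). It records the largest values of $|\eta_1| = |x_1 - x_2|$ and $|\eta_2| = |x_1 + x_2|$ reached during the motion.

```python
def accelerations(x1, x2, kappa, kappa_c, m):
    """Accelerations from the restoring forces on the two masses."""
    a1 = (-kappa * x1 - kappa_c * (x1 - x2)) / m
    a2 = (-kappa * x2 - kappa_c * (x2 - x1)) / m
    return a1, a2

def simulate(x1, x2, steps=20000, dt=0.001, kappa=4.0, kappa_c=1.0, m=1.0):
    """Velocity-Verlet integration from rest; returns (max |eta1|, max |eta2|)."""
    v1 = v2 = 0.0
    a1, a2 = accelerations(x1, x2, kappa, kappa_c, m)
    max_eta1, max_eta2 = abs(x1 - x2), abs(x1 + x2)
    for _ in range(steps):
        x1 += v1 * dt + 0.5 * a1 * dt * dt
        x2 += v2 * dt + 0.5 * a2 * dt * dt
        new_a1, new_a2 = accelerations(x1, x2, kappa, kappa_c, m)
        v1 += 0.5 * (a1 + new_a1) * dt
        v2 += 0.5 * (a2 + new_a2) * dt
        a1, a2 = new_a1, new_a2
        max_eta1 = max(max_eta1, abs(x1 - x2))
        max_eta2 = max(max_eta2, abs(x1 + x2))
    return max_eta1, max_eta2

antisymmetric = simulate(1.0, -1.0)   # x1(0) = -x2(0): only eta1 is excited
symmetric = simulate(1.0, 1.0)        # x1(0) = x2(0): only eta2 is excited
mixed = simulate(0.0, 1.0)            # x2(0) = D, x1(0) = 0: both modes are excited

print("antisymmetric start:", antisymmetric)
print("symmetric start:    ", symmetric)
print("mixed start:        ", mixed)
```

For the antisymmetric and symmetric starts, one of the two normal coordinates stays at zero throughout, whereas the mixed start excites both with comparable amplitude.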


14.4 Center of mass oscillations 


Transforming the coordinates into the center of mass of the two oscillating masses elucidates an interesting 
feature of the normal modes for the two-coupled linear oscillator. As illustrated in figure 14.1, the center- 
of-mass coordinate for the two mass system is 


$$2R_{cm} = l + x_1 + l + l' + x_2 = 2l + l' + \eta_2$$

while the relative separation distance is

$$r = (l + l' + x_2) - (l + x_1) = l' - \eta_1$$

That is, the two normal modes are

$$\eta_1 = l' - r \tag{14.17}$$
$$\eta_2 = 2R_{cm} - 2l - l'$$

The $\eta_1$ mode, which has angular frequency $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, corresponds to an oscillation of the relative
separation $r$, while the center-of-mass location $R_{cm}$ is stationary. By contrast, the $\eta_2$ mode, with angular
frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$, corresponds to an oscillation of the center of mass $R_{cm}$ with the relative separation $r$
being a constant.

Figure 14.5 illustrates the decoupled center-of-mass $R_{cm}$, and relative motions $r$ for both normal modes of
the coupled double-oscillator system. The difference in angular frequencies and amplitudes is readily apparent.

Figure 14.5: Time dependence of the center-of-mass $R_{cm}$ and relative separation $r$ for two coupled linear
oscillators assuming spring constants of $\kappa = 4$ and $\kappa' = 1$.

It is of interest to consider the special case where the spring constant $\kappa = 0$ for the two outside springs.
Then the angular frequencies are $\omega_1 = \sqrt{\frac{2\kappa'}{m}}$ and $\omega_2 = 0$ for the two normal modes. When $\kappa = 0$ the
$\eta_2$ mode is a spurious center-of-mass mode since it corresponds to an oscillation with $\omega_2 = 0$ in spite of
the fact that there are no forces acting on the center of mass. That is, the center-of-mass momentum must be
a constant of motion. This spurious center-of-mass oscillation is a consequence of measuring the
displacements $(x_1, x_2)$ with respect to an arbitrary external reference that is not related to the center of
mass of the coupled system. Spurious center-of-mass modes are encountered frequently in many-body
coupled oscillator systems such as molecules and nuclei. In such cases it is necessary to project out the
center-of-mass motion to eliminate such spurious solutions as will be discussed later.
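A minimal sketch of the $\kappa = 0$ case makes the zero-frequency mode concrete. With no outer springs, the center of mass drifts at constant velocity while the relative coordinate oscillates at $\omega_1 = \sqrt{2\kappa'/m}$. The function names, the drift velocity, and the amplitude below are hypothetical choices for illustration, with $\kappa' = 1$ and $m = 1$ in arbitrary units.

```python
import math

def mode_frequencies(kappa, kappa_c, m):
    """Eigenfrequencies (omega_1, omega_2) of the two coupled oscillators."""
    return math.sqrt((kappa + 2 * kappa_c) / m), math.sqrt(kappa / m)

def positions_without_outer_springs(t, v_cm, amplitude, kappa_c, m):
    """x1(t), x2(t) when kappa = 0: uniform drift plus a relative oscillation."""
    w1 = math.sqrt(2 * kappa_c / m)
    drift = v_cm * t                         # center of mass moves at constant velocity
    eta1 = amplitude * math.cos(w1 * t)      # relative coordinate x1 - x2
    return drift + 0.5 * eta1, drift - 0.5 * eta1

w1, w2 = mode_frequencies(0.0, 1.0, 1.0)
print("omega_1 =", w1, " omega_2 =", w2)
for t in (0.0, 1.0, 2.0, 3.0):
    x1, x2 = positions_without_outer_springs(t, 0.5, 0.2, 1.0, 1.0)
    print(t, 0.5 * (x1 + x2), x1 - x2)
```

The printed center-of-mass displacement grows linearly in time rather than oscillating, which is the content of the statement that the $\omega_2 = 0$ mode is not a genuine vibration.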
14.5 Weak coupling


If one of the two coupled linear oscillator masses is held fixed, then the other free mass will oscillate with a
frequency

$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \tag{14.18}$$

The effect of coupling of the two oscillators is to split the degeneracy of the frequency for each mass to

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} > \omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} > \omega_2 = \sqrt{\frac{\kappa}{m}} \tag{14.19}$$

Thus the degeneracy is broken, and the two normal modes have frequencies straddling the single-oscillator
frequency.

It is interesting to consider the case where the coupling is weak because this situation occurs frequently
in nature. The coupling is weak if the coupling constant $\kappa' \ll \kappa$. Then

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} = \sqrt{\frac{\kappa}{m}} \sqrt{1 + 4\epsilon} \tag{14.20}$$

where

$$\epsilon \equiv \frac{\kappa'}{2\kappa} \ll 1 \tag{14.21}$$

Thus

$$\omega_1 \approx \sqrt{\frac{\kappa}{m}} (1 + 2\epsilon) \tag{14.22}$$

The natural frequency of a single oscillator was shown to be

$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}} (1 + \epsilon) \tag{14.23}$$

that is

$$\sqrt{\frac{\kappa}{m}} \approx \omega_0 (1 - \epsilon) \tag{14.24}$$

Thus the frequencies for the normal modes for weak coupling can be written as

$$\omega_1 = \sqrt{\frac{\kappa}{m}} (1 + 2\epsilon) \approx \omega_0 (1 - \epsilon)(1 + 2\epsilon) \approx \omega_0 (1 + \epsilon) \tag{14.25}$$

while

$$\omega_2 = \sqrt{\frac{\kappa}{m}} \approx \omega_0 (1 - \epsilon) \tag{14.26}$$

That is, the two solutions are split equally spaced about the single uncoupled oscillator value given by
$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}} (1 + \epsilon)$. Note that the single uncoupled oscillator frequency $\omega_0$ depends on
the coupling strength $\kappa'$.

This splitting of the characteristic frequencies is a feature exhibited by many systems of $n$ identical
oscillators where half of the frequencies are shifted upwards and half downward. If $n$ is odd, then the central
frequency is unshifted as illustrated for the case of $n = 3$. An example of this behavior is the Zeeman effect
where the magnetic field couples the atomic motion resulting in a hyperfine splitting of the energy levels as
illustrated.

Figure 14.6: Normal-mode frequencies for $n = 2$ and $n = 3$ weakly-coupled oscillators.
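The quality of the weak-coupling approximation can be checked by comparing the exact normal-mode frequencies with the estimates $\omega_0(1 \pm \epsilon)$, where $\epsilon = \kappa'/2\kappa$. The sketch below assumes $\kappa = 1$ and $m = 1$ in arbitrary units and a few illustrative coupling strengths; the function names are hypothetical.

```python
import math

def exact_frequencies(kappa, kappa_c, m):
    """Exact omega_1, omega_0 and omega_2 for two coupled identical oscillators."""
    w1 = math.sqrt((kappa + 2 * kappa_c) / m)
    w0 = math.sqrt((kappa + kappa_c) / m)
    w2 = math.sqrt(kappa / m)
    return w1, w0, w2

def weak_coupling_estimates(kappa, kappa_c, m):
    """First-order estimates omega_0 (1 + eps) and omega_0 (1 - eps), eps = kappa'/(2 kappa)."""
    eps = kappa_c / (2 * kappa)
    w0 = math.sqrt((kappa + kappa_c) / m)
    return w0 * (1 + eps), w0 * (1 - eps)

for kappa_c in (0.2, 0.02, 0.002):
    w1, w0, w2 = exact_frequencies(1.0, kappa_c, 1.0)
    est1, est2 = weak_coupling_estimates(1.0, kappa_c, 1.0)
    print(f"kappa' = {kappa_c}: error in omega_1 = {w1 - est1:.2e}, "
          f"error in omega_2 = {w2 - est2:.2e}")
```

The errors shrink by about a factor of 100 each time $\kappa'$ is reduced by a factor of 10, as expected for an approximation that neglects terms of order $\epsilon^2$.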
There are myriad examples involving weakly-coupled oscillators in many aspects of the natural world.
The example of collective modes in nuclear physics, illustrated in example 14.13, is typical of applications to
physics, while there are many examples applied to musical instruments, acoustics, and engineering. Weakly-
coupled oscillators are a dominant theme throughout biology as illustrated by congregations of synchronously
flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks
of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and
spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Synchronous motion of
a large number of weakly-coupled oscillators often leads to large collective motion of weakly-coupled systems
as discussed in section 14.12.


14.1 Example: The Grand Piano 


[Figure: Schematic diagram of the action for a grand piano, including the strings, bridge and sounding board
(labelled parts: hitchpin, damper, string, bridge, soundboard, ribs). Note that there are either two or three
parallel strings per note that are hit by a single hammer.]


The grand piano provides an excellent example of a weakly-coupled harmonic oscillator system that has 
normal modes. There are either two or three parallel strings per note that are stretched tightly parallel to the 
top of the horizontal sounding board. The strings press downwards on the bridge that is attached to the top of 
the sounding board. The strings for each note are excited when struck vertically upwards by a single hammer. 
In the bass section of the piano each note comprises two strings tuned to nearly the same frequency. The
coupling of the motion of the strings is via the bridge plus sounding board. Normally, the hammer strikes both 
strings simultaneously exciting the vertical symmetric mode, not the vertical antisymmetric mode. The bridge 
is connected to the sounding board which moves the largest amount for the symmetric mode where both strings 
move the bridge in phase. This strong coupling produces a loud sound. The antisymmetric mode does not 
move the sounding board much since the strings at the bridge move out of phase. Consequently, the symmetric 
mode, which is strongly coupled to the sounding board, damps out more rapidly than the antisymmetric mode,
which is weakly coupled to the sounding board and thus has a longer time constant for decay since the radiated
sound energy is lower than for the symmetric mode.

The una-corda pedal (soft pedal) for a grand piano moves the action sideways such that the hammer strikes 
only one of the two strings, or two of the three strings, resulting in both the symmetric and antisymmetric 
modes being excited equally. The una-corda pedal produces a characteristically different tone than when the 
hammer simultaneously hits the coupled strings; that is, it produces a smaller transient component. The 
symmetric mode rapidly damps due to energy propagation by the sounding board. Thus the longer lasting 
antisymmetric mode becomes more prominent when both modes are equally excited using the una-corda pedal. 
The symmetric and antisymmetric modes have slightly different frequencies and produce beats, which also 
contribute to the different timbre produced using the una-corda pedal. For the mid and upper frequency 
range, the piano has three strings per note which have one symmetric mode and two separate antisymmetric 
modes. To further complicate matters, the strings also can oscillate horizontally which couples weakly to the 
bridge plus sounding board. The strengths with which these different modes are excited depend on subtle 
differences in the shape and roughness of the hammer head striking the strings. Primarily the hammer 
excites the two vertical modes rather than the horizontal modes. 


14.6 General analytic theory for coupled linear oscillators 


The above discussion of a coupled double-oscillator system has shown that it is possible to select symmetric 
and antisymmetric normal modes that are independent and each have characteristic frequencies. The normal 
coordinates for these two normal modes correspond to linear superpositions of the spatial amplitudes of the 
two oscillators and can be obtained by a rotation into the appropriate normal coordinate system. Extension 
of this to systems comprising n coupled linear oscillators, requires development of a general analytic theory, 
that is capable of finding the normal modes plus their eigenvalues and eigenvectors. As illustrated for the 
double oscillator, the solution of many coupled linear oscillators is a classic eigenvalue problem where one has 
to rotate to the principal axis system to project out the normal modes. The following discussion presents a 
general approach to the problem of finding the normal coordinates for a system of n coupled linear oscillators. 

Consider a conservative system of $n$ coupled oscillators, described in terms of generalized coordinates 
$q_k$ and time $t$, with subscript $k = 1, 2, 3, \ldots, n$ for a system with $n$ degrees of freedom. The 
coupled oscillators are assumed to have a stable equilibrium with generalized coordinates $q_{k0}$ at 
equilibrium. In addition, it is assumed that the oscillation amplitudes are sufficiently small to ensure 
that the system is linear. 

For the equilibrium position $q_k = q_{k0}$ the Lagrange equations must satisfy 

$$\dot{q}_k = 0 \qquad \ddot{q}_k = 0 \tag{14.27}$$

Every non-zero term of the form $\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k}$ in Lagrange's equations 
must contain at least either $\dot{q}_k$ or $\ddot{q}_k$, which are zero at equilibrium; thus all such terms 
vanish at equilibrium. That is, at equilibrium 

$$\left(\frac{\partial L}{\partial q_k}\right)_0 = \left(\frac{\partial T}{\partial q_k}\right)_0 - \left(\frac{\partial U}{\partial q_k}\right)_0 = 0 \tag{14.28}$$

where the subscript 0 designates evaluation at equilibrium. 


14.6.1 Kinetic energy tensor T 


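In chapter 7.6 it was shown that, in terms of fixed rectangular coordinates, the kinetic energy for $N$ bodies, 
with $n$ generalized coordinates, is expressed as 

$$T = \frac{1}{2}\sum_{\alpha=1}^{N}\sum_{i=1}^{3} m_\alpha \dot{x}_{\alpha,i}^2 \tag{14.29}$$

Expressing these in terms of generalized coordinates $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ where 
$j = 1, 2, \ldots, n$, then the rectangular velocities are given by 

$$\dot{x}_{\alpha,i} = \sum_{j=1}^{n}\frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{14.30}$$

As discussed in chapter 7.6, if the system is scleronomic then the partial time derivative 

$$\frac{\partial x_{\alpha,i}}{\partial t} = 0 \tag{14.31}$$

Thus the kinetic energy, equation 14.29, of a scleronomic system can be written as a homogeneous quadratic 
function of the generalized velocities 

$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k \tag{14.32}$$

where the components of the kinetic energy tensor T are 

$$T_{jk} \equiv \sum_{\alpha}^{N}\sum_{i}^{3} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{14.33}$$
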
Note that if the velocities $\dot{q}$ correspond to translational velocity, then the kinetic energy tensor T 
corresponds to an effective mass tensor, whereas if the velocities correspond to angular rotational 
velocities, then the kinetic energy tensor T corresponds to the inertia tensor. 

It is possible to make an expansion of the $T_{jk}$ about the equilibrium values of the form 

$$T_{jk}(q_1, q_2, \ldots, q_n) = T_{jk}(q_{i0}) + \sum_l \left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 q_l + \ldots \tag{14.34}$$

Only the first term, $T_{jk}(q_{i0})$, will be kept since the second and higher terms are of the same order 
as the higher-order terms ignored in the Taylor expansion of the potential. Thus, at the equilibrium point, 
assume that $\left(\frac{\partial T}{\partial q_k}\right)_0 = 0$ where $k = 1, 2, 3, \ldots, n$. 


14.6.2 Potential energy tensor V 
Equations 14.28 plus 14.34 imply that 

$$\left(\frac{\partial U}{\partial q_k}\right)_0 = 0 \tag{14.35}$$

where $k = 1, 2, 3, \ldots, n$. 

Make a Taylor expansion about equilibrium for the potential energy, assuming for simplicity that the 
coordinates have been translated to ensure that $q_k = 0$ at equilibrium. This gives 

$$U(q_1, q_2, \ldots, q_n) = U_0 + \sum_k \left(\frac{\partial U}{\partial q_k}\right)_0 q_k + \frac{1}{2}\sum_{j,k}\left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 q_j q_k + \ldots \tag{14.36}$$

The linear term is zero since $\left(\frac{\partial U}{\partial q_k}\right)_0 = 0$ at the equilibrium point, 
and without loss of generality, the potential can be measured with respect to $U_0$. Assume that the 
amplitudes are small, then the expansion can be restricted to the quadratic term, corresponding to the 
simple linear oscillator potential 

$$U(q_1, q_2, \ldots, q_n) - U_0 = U'(q_1, q_2, \ldots, q_n) = \frac{1}{2}\sum_{j,k}\left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 q_j q_k = \frac{1}{2}\sum_{j,k} V_{jk} q_j q_k \tag{14.37}$$

That is 

$$U'(q_1, q_2, \ldots, q_n) = \frac{1}{2}\sum_{j,k} V_{jk} q_j q_k \tag{14.38}$$

where the components of the potential energy tensor V are defined as 

$$V_{jk} \equiv \left(\frac{\partial^2 U'}{\partial q_j \partial q_k}\right)_0 \tag{14.39}$$

Note that the order of differentiation is unimportant and thus the quantity $V_{jk}$ is symmetric 

$$V_{jk} = V_{kj} \tag{14.40}$$

The motion of the system has been specified for small oscillations around the equilibrium position and 
it has been shown that $U'(q_1, q_2, \ldots, q_n)$ has a minimum value at equilibrium which is taken to be 
zero for convenience. 

In conclusion, equations (14.32) and (14.38) give 

$$T = \frac{1}{2}\sum_{j,k}^{n} T_{jk}\dot{q}_j\dot{q}_k \tag{14.41}$$

$$U' = \frac{1}{2}\sum_{j,k}^{n} V_{jk} q_j q_k \tag{14.42}$$

where the components of the kinetic energy tensor T and potential energy tensor V are 

$$T_{jk} \equiv \left(\sum_{\alpha}^{N}\sum_{i}^{3} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\right)_0 \tag{14.43}$$

$$V_{jk} \equiv \left(\frac{\partial^2 U'}{\partial q_j \partial q_k}\right)_0 \tag{14.44}$$

Note that $q_j$ and $q_k$ may have different units, but all the terms in the summations for both $T$ and $U'$ 
have units of energy. The $V_{jk}$ and $T_{jk}$ values are evaluated at the equilibrium point, and thus both 
$V_{jk}$ and $T_{jk}$ are $n \times n$ arrays of values evaluated at the equilibrium location. 
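
The definition of the potential energy tensor as the matrix of second derivatives of $U'$ at equilibrium can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `potential_energy_tensor`, the step size `h`, and the test potential with its spring constants are all illustrative assumptions.

```python
def potential_energy_tensor(U, q0, h=1e-4):
    """Estimate V_jk = d^2 U / dq_j dq_k at the point q0 by central differences.

    U  : callable taking a list of generalized coordinates
    q0 : equilibrium coordinates
    h  : finite-difference step (an assumed, illustrative value)
    """
    n = len(q0)
    V = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            total = 0.0
            # Four-point stencil for the (possibly mixed) second derivative.
            for sj, sk in ((1, 1), (1, -1), (-1, 1), (-1, -1)):
                q = list(q0)
                q[j] += sj * h
                q[k] += sk * h
                total += sj * sk * U(q)
            V[j][k] = total / (4.0 * h * h)
    return V


# Hypothetical test potential: two masses on springs KAPPA, coupled by KAPPA_C.
KAPPA, KAPPA_C = 3.0, 0.5


def U_example(q):
    x1, x2 = q
    return 0.5 * KAPPA * x1**2 + 0.5 * KAPPA * x2**2 + 0.5 * KAPPA_C * (x2 - x1)**2


V = potential_energy_tensor(U_example, [0.0, 0.0])
```

For this quadratic potential the finite-difference estimate reproduces the symmetric array with diagonal elements KAPPA + KAPPA_C and off-diagonal elements -KAPPA_C, to within rounding error.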


14.6.3 Equations of motion 


Both the kinetic energy and potential energy terms are products of the coordinates leading to a set of 
coupled equations that are complicated to solve. The problem is greatly simplified by selecting a set of 
normal coordinates for which both T and U are diagonal, so that the coupling terms disappear. Thus a 
coordinate transformation must be found that simultaneously diagonalizes $T_{jk}$ and $V_{jk}$ in order to 
obtain a set of normal coordinates. 

The kinetic energy T is only a function of generalized velocities $\dot{q}_k$ while the conservative 
potential energy is only a function of the generalized coordinates $q_k$. Thus the Lagrange equations 

$$\frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = 0 \tag{14.45}$$

reduce to 

$$\frac{\partial U}{\partial q_k} + \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k} = 0 \tag{14.46}$$

But 

$$\frac{\partial U}{\partial q_k} = \sum_j^n V_{jk} q_j \tag{14.47}$$

and 

$$\frac{\partial T}{\partial \dot{q}_k} = \sum_j^n T_{jk}\dot{q}_j \tag{14.48}$$

Thus the Lagrange equations reduce to the following set of equations of motion, 

$$\sum_j^n \left(V_{jk} q_j + T_{jk}\ddot{q}_j\right) = 0 \tag{14.49}$$


Taking each $k$ in turn, where $1 \leq k \leq n$, equation 14.49 is a set of $n$ second-order linear 
homogeneous differential equations with constant coefficients. Since the system is oscillatory, it is 
natural to try a solution of the form 

$$q_j(t) = a_j e^{i(\omega t - \delta)} \tag{14.50}$$

Assuming that the system is conservative, then this implies that $\omega$ is real, since an imaginary term 
for $\omega$ would lead to an exponential damping term. The arbitrary constants are the real amplitude $a_j$ 
and the phase $\delta$. Substitution of this trial solution for each $k$ leads to a set of equations 

$$\sum_j^n \left(V_{jk} - \omega^2 T_{jk}\right) a_j = 0 \tag{14.51}$$

where the common factor $e^{i(\omega t - \delta)}$ has been removed. Equation 14.51 corresponds to a set of 
$n$ linear homogeneous algebraic equations that the $a_j$ amplitudes must satisfy for each $k$. For a 
non-trivial solution to exist, the determinant of the coefficients must vanish, that is 

$$\begin{vmatrix}
V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \ldots \\
V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \ldots \\
V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \ldots \\
\vdots & \vdots & \vdots & \ddots
\end{vmatrix} = 0 \tag{14.52}$$

where the symmetry $V_{jk} = V_{kj}$ has been included. This is the standard eigenvalue problem for which 
the above determinant gives the secular equation or the characteristic equation. It is an equation 
of degree $n$ in $\omega^2$. The $n$ roots of this equation are $\omega_r^2$ where $\omega_r$ are the 
characteristic frequencies or eigenfrequencies of the normal modes. 

Substitution of $\omega_r^2$ into equation 14.51 determines the ratio 
$a_{1r} : a_{2r} : a_{3r} : \ldots : a_{nr}$ for this solution, which defines the components of the 
$n$-dimensional eigenvector $\mathbf{a}_r$. That is, solution of the secular equations has determined the 
eigenvalues and eigenvectors of the $n$ solutions of the coupled-channel system. 


14.6.4 Superposition 


The equations of motion $\sum_j \left(V_{jk} q_j + T_{jk}\ddot{q}_j\right) = 0$ are linear equations that 
satisfy superposition. Thus the most general solution $q_j(t)$ can be a superposition of the $n$ 
eigenvectors $a_{jr}$, that is 

$$q_j(t) = \sum_r^n a_{jr} e^{i(\omega_r t - \delta_r)} \tag{14.53}$$

Only the real part of $q_j(t)$ is meaningful, that is, 

$$q_j(t) = \mathrm{Re}\sum_r^n a_{jr} e^{i(\omega_r t - \delta_r)} = \sum_r^n a_{jr}\cos(\omega_r t - \delta_r) \tag{14.54}$$

Thus the most general solution of these linear equations involves a sum over the eigenvectors of the 
system, each multiplied by a cosine function of time at the corresponding eigenfrequency. 
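
The secular problem of equation 14.51 and the superposition of equation 14.54 can be sketched numerically. The example below is an illustration under stated assumptions rather than the method used in this book: it assumes NumPy is available, reduces the generalized problem to a standard symmetric one through a Cholesky factorization of T, and uses hypothetical function names (`normal_modes`, `general_solution`) and hypothetical numerical values.

```python
import numpy as np


def normal_modes(V, T):
    """Solve sum_j (V_jk - w^2 T_jk) a_j = 0 for symmetric V and
    symmetric positive-definite T.

    Returns (omega, a): eigenfrequencies in ascending order, and a matrix whose
    column r is the eigenvector a_r, normalized so that a^T T a = identity.
    """
    V = np.asarray(V, dtype=float)
    T = np.asarray(T, dtype=float)
    L = np.linalg.cholesky(T)            # T = L L^T
    L_inv = np.linalg.inv(L)
    A = L_inv @ V @ L_inv.T              # equivalent standard symmetric problem
    w2, y = np.linalg.eigh(A)            # eigenvalues w^2 in ascending order
    a = L_inv.T @ y                      # back-transform: a = L^{-T} y
    omega = np.sqrt(np.clip(w2, 0.0, None))  # stable equilibrium: w^2 >= 0
    return omega, a


def general_solution(t, omega, a, amplitudes, phases):
    """Real superposition q_j(t) = sum_r c_r a_jr cos(w_r t - delta_r)."""
    c = np.asarray(amplitudes, dtype=float)
    delta = np.asarray(phases, dtype=float)
    return a @ (c * np.cos(omega * t - delta))


# Illustrative numbers only: two masses m = 2 on springs kappa = 8,
# coupled by a spring kappa' = 5.
T_example = [[2.0, 0.0], [0.0, 2.0]]
V_example = [[13.0, -5.0], [-5.0, 13.0]]
omega, a = normal_modes(V_example, T_example)
```

With these numbers the two eigenfrequencies are 2 and 3 in the chosen units. The columns of `a` come out normalized with respect to T, which is the normalization convention discussed in the next subsection.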


14.6.5 Eigenfunction orthonormality 


It can be shown that the eigenvectors are orthogonal. In addition, the above procedure only determines ratios 
of amplitudes, thus there is an indeterminacy that can be used to normalize the $a_{jr}$. Thus the 
eigenvectors form an orthonormal set. Orthonormality of the eigenfunctions for the rank 3 inertia tensor was 
illustrated in chapter 13.10.2. Similar arguments apply that allow extending orthonormality to higher rank 
cases such as that for $n$-body coupled oscillators. 

The eigenfunction orthogonality for $n$ coupled oscillators can be proved by writing equation 14.51 
for both the $s^{th}$ root and the $r^{th}$ root. That is, 

$$\sum_j V_{jk} a_{js} = \omega_s^2 \sum_j T_{jk} a_{js} \tag{14.55}$$

$$\sum_j V_{jk} a_{jr} = \omega_r^2 \sum_j T_{jk} a_{jr} \tag{14.56}$$

Multiply equation 14.55 by $a_{kr}$ and sum over $k$. Similarly multiply equation 14.56 by $a_{ks}$ and sum 
over $k$. After interchanging the dummy indices $j$ and $k$ in the first sum, and using the symmetry of 
$V_{jk}$ and $T_{jk}$, these summations lead to 

$$\sum_{j,k} V_{jk} a_{jr} a_{ks} = \omega_s^2 \sum_{j,k} T_{jk} a_{jr} a_{ks} \tag{14.57}$$

$$\sum_{j,k} V_{jk} a_{jr} a_{ks} = \omega_r^2 \sum_{j,k} T_{jk} a_{jr} a_{ks} \tag{14.58}$$

Note that the left-hand sides of these two equations are identical. Thus taking the difference between these 
equations gives 

$$\left(\omega_r^2 - \omega_s^2\right)\sum_{j,k} T_{jk} a_{jr} a_{ks} = 0 \tag{14.59}$$

Note that if $\left(\omega_r^2 - \omega_s^2\right) \neq 0$, that is, assuming that the eigenfrequencies are 
not degenerate, then to ensure that equation 14.59 is zero requires that 

$$\sum_{j,k} T_{jk} a_{jr} a_{ks} = 0 \qquad r \neq s \tag{14.60}$$

This shows that the eigenfunctions are orthogonal. If the eigenfrequencies are degenerate, i.e. 
$\omega_r^2 = \omega_s^2$, then, with no loss of generality, the axes $r$ and $s$ can be chosen to be 
orthogonal. 
The eigenfunction normalization can be chosen freely since only ratios of the eigenfunction components 
$a_{jr}$ are determined when $\omega_r$ is used in equation 14.51. The kinetic energy, given by equation 
14.32, must be positive, or zero for the case of a static system. That is 

$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k \geq 0 \tag{14.61}$$

Using the time derivative of equation 14.54 to determine $\dot{q}_k$, and inserting it into equation 14.61, 
gives that the kinetic energy is 

$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k = \frac{1}{2}\sum_{j,k} T_{jk}\sum_{r,s}\omega_r\omega_s\, a_{jr}\sin(\omega_r t - \delta_r)\, a_{ks}\sin(\omega_s t - \delta_s) \tag{14.62}$$

For the diagonal terms $r = s$ 

$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k = \sum_r \left[\frac{1}{2}\omega_r^2\sin^2(\omega_r t - \delta_r)\right]\sum_{j,k} T_{jk} a_{jr} a_{kr} \geq 0 \tag{14.63}$$

Since the term in the square brackets must be positive, then 

$$\sum_{j,k} T_{jk} a_{jr} a_{kr} \geq 0 \tag{14.64}$$

Since this sum must be a positive number, and the magnitude of the amplitudes can be chosen freely, then 
it is possible to normalize the eigenfunction amplitudes to unity. That is, choose that 

$$\sum_{j,k} T_{jk} a_{jr} a_{kr} = 1 \tag{14.65}$$

The orthogonality equation 14.60 and the normalization equation 14.65 can be combined into a single 
orthonormalization equation 

$$\sum_{j,k} T_{jk} a_{jr} a_{ks} = \delta_{rs} \tag{14.66}$$

This has shown that the eigenvectors form an orthonormal set. 
Since the $j^{th}$ component of the $r^{th}$ eigenvector is $a_{jr}$, then the $r^{th}$ eigenvector can be 
written in the form 

$$\mathbf{a}_r = \sum_j a_{jr}\hat{\mathbf{e}}_j \tag{14.67}$$

where $\hat{\mathbf{e}}_j$ are the unit vectors for the generalized coordinates. 
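
As a brief worked illustration of the orthonormalization condition, consider the two equal masses of chapter 14.2, for which $T_{jk} = m\,\delta_{jk}$ and the eigenvector components have the ratios $(1, -1)$ and $(1, 1)$. This is a sketch added for clarity; the choice of positive sign for $a_{11}$ and $a_{22}$ is an arbitrary convention.

```latex
% Normalization (r = s): T_{jk} = m \delta_{jk}, with a_{21} = -a_{11} and a_{12} = a_{22}
\sum_{j,k} T_{jk} a_{j1} a_{k1} = m\left(a_{11}^2 + a_{21}^2\right) = 2 m a_{11}^2 = 1
\quad\Longrightarrow\quad a_{11} = \frac{1}{\sqrt{2m}}, \qquad a_{21} = -\frac{1}{\sqrt{2m}}

\sum_{j,k} T_{jk} a_{j2} a_{k2} = m\left(a_{12}^2 + a_{22}^2\right) = 2 m a_{22}^2 = 1
\quad\Longrightarrow\quad a_{12} = a_{22} = \frac{1}{\sqrt{2m}}

% Orthogonality (r \neq s)
\sum_{j,k} T_{jk} a_{j1} a_{k2} = m\left(a_{11} a_{12} + a_{21} a_{22}\right)
= m\left(a_{11} a_{22} - a_{11} a_{22}\right) = 0
```

Note that the normalization is with respect to the kinetic energy tensor, so the amplitudes carry a factor $1/\sqrt{m}$ rather than being unit vectors in the ordinary sense.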


14.6.6 Normal coordinates 


The above general solution of the coupled-oscillator problem is best expressed in terms of the normal 
coordinates which are independent. It is more transparent if the superposition of the normal modes is 
written in the form 

$$q_j(t) = \sum_r^n \beta_r a_{jr} e^{i\omega_r t} \tag{14.68}$$

where the complex factor $\beta_r$ absorbs the arbitrary scale factor needed to allow for arbitrary 
amplitudes $q_j$, since the amplitudes $a_{jr}$ have been normalized, together with the phase factor 
involving $\delta_r$. 
Define 

$$\eta_r(t) \equiv \beta_r e^{i\omega_r t} \tag{14.69}$$

then equation 14.68 can be written as 

$$q_j(t) = \sum_r^n a_{jr}\eta_r(t) \tag{14.70}$$

Equation 14.70 can be expressed schematically as the matrix multiplication 

$$\mathbf{q} = \{\mathbf{a}\}\cdot\boldsymbol{\eta} \tag{14.71}$$

The $\eta_r(t)$ are the normal coordinates which can be expressed in the form 

$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\mathbf{q} \tag{14.72}$$

Each normal mode $\eta_r$ corresponds to a single eigenfrequency, $\omega_r$, which satisfies the linear 
oscillator equation 

$$\ddot{\eta}_r + \omega_r^2\eta_r = 0 \tag{14.73}$$


14.7 Two-body coupled oscillator systems 


The two-body coupled oscillator is the simplest coupled-oscillator system that illustrates the general fea- 
tures of coupled oscillators. The following four examples involve parallel and series couplings of two linear 
oscillators or two plane pendula. 


14.2 Example: Two coupled linear oscillators 


The coupled double-oscillator problem, figure 14.1 discussed in chapter 14.2, can be used to demonstrate 
that the general analytic theory gives the same solution as obtained by direct solution of the equations of 
motion in chapter 14.2. 

1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized 
coordinates, which here are $x_1$ and $x_2$. The potential energy is 

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa')x_1^2 + \frac{1}{2}(\kappa + \kappa')x_2^2 - \kappa' x_1 x_2$$

while the kinetic energy is given by 

$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2$$

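2) The second stage is to evaluate the potential energy V and kinetic energy T tensors. The potential 
energy tensor V is nondiagonal since $V_{jk}$ gives 

$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1^2}\right)_0 = \kappa + \kappa' = V_{22}$$

$$V_{12} \equiv \left(\frac{\partial^2 U}{\partial x_1 \partial x_2}\right)_0 = -\kappa' = V_{21}$$

That is, the potential energy tensor V is 

$$\mathbf{V} = \begin{pmatrix} \kappa + \kappa' & -\kappa' \\ -\kappa' & \kappa + \kappa' \end{pmatrix}$$

Similarly, the kinetic energy is given by 

$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\dot{x}_j\dot{x}_k$$

Since $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$ then the kinetic energy tensor T is 

$$\mathbf{T} = \begin{pmatrix} m & 0 \\ 0 & m \end{pmatrix}$$

Note that for this case, the kinetic energy tensor T equals the mass tensor, which is diagonal, whereas the 
potential energy tensor equals the spring constant tensor, which is nondiagonal. 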


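3) The third stage is to use the potential energy V and kinetic energy T tensors to evaluate the secular 
determinant using equation 14.52 

$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0$$

The expansion of this secular determinant yields 

$$\left(\kappa + \kappa' - m\omega^2\right)^2 - \kappa'^2 = 0$$

That is 

$$\kappa + \kappa' - m\omega^2 = \pm\kappa'$$

Solving for $\omega_r$ gives 

$$\omega_r = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{m}}$$

The solutions are 

$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} \qquad \omega_2 = \sqrt{\frac{\kappa}{m}}$$

which is the same as derived previously (equations 14.7-9). 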
4) The fourth step is to insert either one of these eigenfrequencies into the secular equation 

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0 \tag{a}$$

Consider the secular equation (a) for $k = 1$ 

$$\left(\kappa + \kappa' - \omega_r^2 m\right) a_{1r} - \kappa' a_{2r} = 0$$

Then for the first eigenfrequency $\omega_1$, that is, $k = 1, r = 1$ 

$$\left(\kappa + \kappa' - \kappa - 2\kappa'\right) a_{11} - \kappa' a_{21} = 0$$

which simplifies to 

$$a_{11} = -a_{21}$$

Similarly, for the other eigenfrequency $\omega_2$, that is, $k = 1, r = 2$ 

$$\left(\kappa + \kappa' - \kappa\right) a_{12} - \kappa' a_{22} = 0$$

which simplifies to 

$$a_{12} = a_{22}$$


5) The final stage is to write the general coordinates in terms of the normal coordinates 
$\eta_r(t) \equiv \beta_r e^{i\omega_r t}$. Thus 

$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$

and 

$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting gives that the normal modes are 

$$\eta_1 = \frac{1}{2a_{11}}(x_1 - x_2) \qquad \eta_2 = \frac{1}{2a_{22}}(x_1 + x_2)$$

Thus the symmetric normal mode $\eta_2$ corresponds to an oscillation of the center-of-mass with the lower 
frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$. This frequency is the same as for one single mass on a spring 
of spring constant $\kappa$, which is as expected since they vibrate in unison and thus the coupling spring 
force does not act. The antisymmetric mode $\eta_1$ has the higher frequency 
$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$ since the restoring force includes both the 
main spring plus the coupling spring. 
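
The transformation between the displacements and the normal coordinates in stage 5 can be sketched in a few lines. This is an illustration only; the function names are hypothetical and the default amplitudes $a_{11} = a_{22} = 1$ are an arbitrary, unnormalized choice.

```python
def to_normal(x1, x2, a11=1.0, a22=1.0):
    """Displacements -> normal coordinates (eta1 antisymmetric, eta2 symmetric)."""
    eta1 = (x1 - x2) / (2.0 * a11)
    eta2 = (x1 + x2) / (2.0 * a22)
    return eta1, eta2


def from_normal(eta1, eta2, a11=1.0, a22=1.0):
    """Normal coordinates -> displacements, using a21 = -a11 and a12 = a22."""
    x1 = a11 * eta1 + a22 * eta2
    x2 = -a11 * eta1 + a22 * eta2
    return x1, x2


# Equal displacements excite only the symmetric mode eta2;
# opposite displacements excite only the antisymmetric mode eta1.
symmetric_start = to_normal(1.0, 1.0)
antisymmetric_start = to_normal(1.0, -1.0)
```

Displacing only one mass, for example `to_normal(1.0, 0.0)`, gives equal amounts of both normal coordinates, which is why such an initial condition produces beating between the two masses.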

The above example illustrates that the general analytic theory for coupled linear oscillators gives the 
same answer as obtained in chapter 14.2 using Newton’s equations of motion. However, the general analytic 
theory is a more powerful technique for solving complicated coupled oscillator systems. Thus the general 
analytic theory will be used for solving all the following coupled oscillator problems. 


14.3 Example: Two equal masses series-coupled by two equal springs 


Consider the series-coupled system shown in the figure. 

[Figure: Two equal masses series-coupled by two equal springs.] 

1) The first stage is to determine the potential and kinetic energies using an appropriate set of 
generalized coordinates, which here are $x_1$ and $x_2$. The potential energy is 

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa(x_2 - x_1)^2 = \kappa x_1^2 + \frac{1}{2}\kappa x_2^2 - \kappa x_1 x_2$$

while the kinetic energy is given by 

$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2$$



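2) The second stage is to evaluate the potential energy V and kinetic energy T tensors. The potential 
energy tensor V is nondiagonal since $V_{jk}$ gives 

$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1^2}\right)_0 = 2\kappa$$

$$V_{12} \equiv \left(\frac{\partial^2 U}{\partial x_1 \partial x_2}\right)_0 = -\kappa = V_{21}$$

$$V_{22} \equiv \left(\frac{\partial^2 U}{\partial x_2^2}\right)_0 = \kappa$$

That is, the potential energy tensor V is 

$$\mathbf{V} = \begin{pmatrix} 2\kappa & -\kappa \\ -\kappa & \kappa \end{pmatrix}$$

Similarly, since the kinetic energy is given by 

$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\dot{x}_j\dot{x}_k$$

then $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$. Thus the kinetic energy tensor T is 

$$\mathbf{T} = \begin{pmatrix} m & 0 \\ 0 & m \end{pmatrix}$$

Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal. 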


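3) The third stage is to use the potential energy V and kinetic energy T tensors to evaluate the secular 
determinant using equation 14.52 

$$\begin{vmatrix} 2\kappa - m\omega^2 & -\kappa \\ -\kappa & \kappa - m\omega^2 \end{vmatrix} = 0$$

The expansion of this secular determinant yields 

$$\left(2\kappa - m\omega^2\right)\left(\kappa - m\omega^2\right) - \kappa^2 = 0$$

That is 

$$\omega^4 - 3\frac{\kappa}{m}\omega^2 + \frac{\kappa^2}{m^2} = 0$$

The solutions are 

$$\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{m}} \qquad \omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{m}}$$
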
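
The step from the characteristic equation to the quoted eigenfrequencies can be made explicit. The short derivation below is added for clarity and introduces the shorthand $\lambda \equiv \omega^2$, which is not used elsewhere in the text.

```latex
% Quadratic in \lambda = \omega^2
\lambda^2 - 3\frac{\kappa}{m}\lambda + \frac{\kappa^2}{m^2} = 0
\quad\Longrightarrow\quad
\lambda = \frac{\kappa}{m}\,\frac{3 \pm \sqrt{5}}{2}

% The square roots are exact because
\left(\frac{\sqrt{5} \pm 1}{2}\right)^2 = \frac{6 \pm 2\sqrt{5}}{4} = \frac{3 \pm \sqrt{5}}{2}

% Hence
\omega = \sqrt{\lambda} = \frac{\sqrt{5} \pm 1}{2}\sqrt{\frac{\kappa}{m}}
```

The two numerical factors, approximately 1.618 and 0.618, are reciprocals of each other, so that $\omega_1\omega_2 = \kappa/m$.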
4) The fourth step is to insert these eigenfrequencies into the secular equation 14.51 

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider $k = 1$ in the above equation 

$$\left(2\kappa - \omega_r^2 m\right) a_{1r} - \kappa a_{2r} = 0$$

Then for eigenfrequency $\omega_1$, that is, $k = 1, r = 1$ 

$$-\frac{\sqrt{5} - 1}{2}\, a_{11} = a_{21}$$

Similarly, for $k = 1, r = 2$ 

$$\frac{\sqrt{5} + 1}{2}\, a_{12} = a_{22}$$


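5) The final stage is to write the general coordinates in terms of the normal coordinates 
$\eta_r(t) \equiv \beta_r e^{i\omega_r t}$. Thus 

$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + \frac{2}{\sqrt{5} + 1}\, a_{22}\eta_2$$

and 

$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -\frac{\sqrt{5} - 1}{2}\, a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting gives that the normal modes are 

$$\eta_1 = \frac{1}{a_{11}\sqrt{5}}\left(\frac{\sqrt{5} + 1}{2}\, x_1 - x_2\right) \qquad \eta_2 = \frac{1}{a_{22}\sqrt{5}}\left(x_1 + \frac{\sqrt{5} + 1}{2}\, x_2\right)$$

Thus the symmetric normal mode has the lower frequency 
$\omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{m}}$. The antisymmetric mode has the higher 
frequency $\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{m}}$ since both springs provide the 
restoring force. This case is interesting in that for both normal modes, the amplitudes for the motion of 
the two masses are different. 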


14.4 Example: Two parallel-coupled plane pendula 


Consider the coupled double pendulum system shown in the adjacent figure, which comprises two parallel 
plane pendula weakly coupled by a spring. The angles $\theta_1$ and $\theta_2$ are chosen to be the 
generalized coordinates and the potential energy is chosen to be zero at equilibrium. Then the kinetic 
energy is 

$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2$$

[Figure: Two parallel-coupled plane pendula.] 

As discussed in chapter 3, it is necessary to make the small-angle approximation in order to make the 
equations of motion for the simple pendulum linear and solvable analytically. That is, 

$$U = mgb(1 - \cos\theta_1) + mgb(1 - \cos\theta_2) + \frac{1}{2}\kappa\left(b\sin\theta_1 - b\sin\theta_2\right)^2 \approx \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2\right) + \frac{\kappa b^2}{2}\left(\theta_1 - \theta_2\right)^2$$

assuming the small angle approximation $\sin\theta \approx \theta$ and 
$(1 - \cos\theta) \approx \frac{\theta^2}{2}$. 

The second stage is to evaluate the kinetic energy T and potential energy V tensors 

$$\mathbf{T} = \begin{pmatrix} mb^2 & 0 \\ 0 & mb^2 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} mgb + \kappa b^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 \end{pmatrix}$$

Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal. 
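The third stage is to evaluate the secular determinant 

$$\begin{vmatrix} mgb + \kappa b^2 - \omega^2 mb^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 - \omega^2 mb^2 \end{vmatrix} = 0$$

which gives the characteristic equation 

$$\left(mgb + \kappa b^2 - \omega^2 mb^2\right)^2 = \left(\kappa b^2\right)^2$$

or 

$$mg + \kappa b - \omega^2 mb = \pm\kappa b$$

The two solutions are 

$$\omega_1^2 = \frac{g}{b} \qquad \omega_2^2 = \frac{g}{b} + \frac{2\kappa}{m}$$
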

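The fourth step is to insert these eigenfrequencies into equation 14.51 

$$\sum_j^n \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider $k = 1$ 

$$\left(mgb + \kappa b^2 - \omega_r^2 mb^2\right) a_{1r} - \kappa b^2 a_{2r} = 0$$

Then for the first eigenfrequency, $\omega_1$, the subscripts are $k = 1, r = 1$ 

$$\left(mgb + \kappa b^2 - \frac{g}{b}mb^2\right) a_{11} - \kappa b^2 a_{21} = 0$$

which simplifies to 

$$a_{11} = a_{21}$$

Similarly, for $k = 1, r = 2$ 

$$\left(mgb + \kappa b^2 - \left(\frac{g}{b} + \frac{2\kappa}{m}\right) mb^2\right) a_{12} - \kappa b^2 a_{22} = 0$$

which simplifies to 

$$a_{12} = -a_{22}$$
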


The final stage is to write the general coordinates in terms of the normal coordinates 

$$\theta_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 - a_{22}\eta_2$$

and 

$$\theta_2 = a_{21}\eta_1 + a_{22}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$

Adding or subtracting these equations gives that the normal modes are 

$$\eta_1 = \frac{1}{2a_{11}}\left(\theta_1 + \theta_2\right) \qquad \eta_2 = \frac{1}{2a_{22}}\left(\theta_2 - \theta_1\right)$$

As for the case of the double oscillator discussed in example 14.2, the symmetric normal mode corresponds 
to an oscillation of the center-of-mass, with zero relative motion of the two pendula, which has the lower 
frequency $\omega_1 = \sqrt{\frac{g}{b}}$. This frequency is the same as for one independent pendulum, as 
expected since they vibrate in unison and thus the only restoring force is gravity. The antisymmetric mode 
corresponds to relative motion of the two pendula with stationary center-of-mass and has the frequency 
$\omega_2 = \sqrt{\frac{g}{b} + \frac{2\kappa}{m}}$ 
since the restoring force includes both the coupling spring and gravity. 
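
The normal-mode solution can be compared against a direct numerical integration of the linearized equations of motion. The sketch below is illustrative only: the parameter values are hypothetical, the velocity-Verlet integrator is a choice made here rather than a method used in this book, and the initial condition (pendulum 1 displaced, pendulum 2 at equilibrium, both at rest) is an assumption.

```python
import math

# Hypothetical parameter values (SI units), chosen only for illustration.
G, B, M, KAPPA = 9.81, 1.0, 1.0, 2.0


def accelerations(th1, th2):
    """Linearized equations of motion for the two spring-coupled pendula."""
    w0sq = G / B
    c = KAPPA / M
    return (-(w0sq + c) * th1 + c * th2,
            -(w0sq + c) * th2 + c * th1)


def integrate(th1, th2, t_end, dt=1e-4):
    """Velocity-Verlet integration, starting from rest."""
    v1 = v2 = 0.0
    a1, a2 = accelerations(th1, th2)
    for _ in range(int(round(t_end / dt))):
        th1 += v1 * dt + 0.5 * a1 * dt * dt
        th2 += v2 * dt + 0.5 * a2 * dt * dt
        n1, n2 = accelerations(th1, th2)
        v1 += 0.5 * (a1 + n1) * dt
        v2 += 0.5 * (a2 + n2) * dt
        a1, a2 = n1, n2
    return th1, th2


def normal_mode_solution(theta0, t):
    """Superposition of the symmetric and antisymmetric modes for the
    initial condition theta1 = theta0, theta2 = 0, both at rest."""
    w1 = math.sqrt(G / B)
    w2 = math.sqrt(G / B + 2.0 * KAPPA / M)
    return (0.5 * theta0 * (math.cos(w1 * t) + math.cos(w2 * t)),
            0.5 * theta0 * (math.cos(w1 * t) - math.cos(w2 * t)))


numeric = integrate(0.1, 0.0, 3.0)
analytic = normal_mode_solution(0.1, 3.0)
```

The two results agree to within the integration error. Because the initial displacement excites both normal modes equally, the motion is passed back and forth between the two pendula at the beat frequency.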

This example introduces the role of degeneracy, which occurs in this system if the coupling of the pendula 
is zero, that is, $\kappa = 0$, leading to both frequencies being equal, i.e. 
$\omega_1 = \omega_2 = \sqrt{\frac{g}{b}}$. When $\kappa = 0$, then both {T} and {V} are diagonal and thus 
in the $(\theta_1, \theta_2)$ space the two pendula are independent normal modes. However, the symmetric 
and antisymmetric normal modes, as derived above, are equally good normal modes. In fact, since the modes 
are degenerate, any linear combination of the motions of the independent pendula is an equally good normal 
mode and thus one can use any set of orthogonal normal modes to describe the motion. 


14.5 Example: The series-coupled double plane pendula 


The double-pendula system comprises one plane pendulum attached to the end of another plane pendulum, 
both oscillating in the same plane. The kinetic and potential energies for this system are given in example 
6.21 to be 

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\phi}_1^2 + m_2 L_1 L_2\dot{\phi}_1\dot{\phi}_2\cos(\phi_1 - \phi_2) + \frac{1}{2}m_2 L_2^2\dot{\phi}_2^2$$

$$U = (m_1 + m_2)gL_1(1 - \cos\phi_1) + m_2 gL_2(1 - \cos\phi_2)$$

[Figure: Two series-coupled plane pendula.] 

a) Small-amplitude linear regime 
Use of the small-angle approximation makes this system linear and solvable analytically. That is, T and U 
become 

$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\phi}_1^2 + m_2 L_1 L_2\dot{\phi}_1\dot{\phi}_2 + \frac{1}{2}m_2 L_2^2\dot{\phi}_2^2$$

$$U = \frac{1}{2}(m_1 + m_2)gL_1\phi_1^2 + \frac{1}{2}m_2 gL_2\phi_2^2$$

Thus the kinetic energy and potential energy tensors are 

$$\mathbf{T} = \begin{pmatrix} (m_1 + m_2)L_1^2 & m_2 L_1 L_2 \\ m_2 L_1 L_2 & m_2 L_2^2 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} (m_1 + m_2)gL_1 & 0 \\ 0 & m_2 gL_2 \end{pmatrix}$$

Note that T is nondiagonal, whereas V is diagonal, which is opposite to the case of the two 
parallel-coupled plane pendula. 

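The solution of this case is simpler if it is assumed that $L_1 = L_2 = L$ and $m_1 = m_2 = m$. Then 

$$\mathbf{T} = mL^2\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \qquad \mathbf{V} = mL^2\begin{pmatrix} 2\omega_0^2 & 0 \\ 0 & \omega_0^2 \end{pmatrix}$$

where $\omega_0 = \sqrt{\frac{g}{L}}$ which is the frequency of a single pendulum. 
The next stage is to evaluate the secular determinant 

$$mL^2\begin{vmatrix} 2\left(\omega_0^2 - \omega^2\right) & -\omega^2 \\ -\omega^2 & \left(\omega_0^2 - \omega^2\right) \end{vmatrix} = 0$$

The eigenvalues are 

$$\omega_1^2 = \left(2 - \sqrt{2}\right)\omega_0^2 \qquad \omega_2^2 = \left(2 + \sqrt{2}\right)\omega_0^2$$

As shown in the adjacent figure, the normal modes for this system are, up to normalization, 

$$\eta_1 \propto \phi_1 + \frac{\phi_2}{\sqrt{2}} \qquad \eta_2 \propto \phi_1 - \frac{\phi_2}{\sqrt{2}}$$

[Figure: Normal modes for two series-coupled plane pendula.] 

The second mass has a $\sqrt{2}$ larger amplitude that is in phase for solution 1 and out of phase for solution 2. 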
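
The eigenvalues and the factor of $\sqrt{2}$ in the amplitudes can be checked directly from the 2 x 2 secular equation, including the off-diagonal kinetic energy terms. The sketch below is illustrative: it works in assumed units where $mL^2 = 1$ and $\omega_0 = 1$, and the function names are hypothetical.

```python
import math


def secular_roots_2x2(V, T):
    """Roots w^2 of det(V - w^2 T) = 0 for symmetric 2x2 V and T, ascending."""
    a = T[0][0] * T[1][1] - T[0][1] ** 2
    b = -(V[0][0] * T[1][1] + V[1][1] * T[0][0] - 2.0 * V[0][1] * T[0][1])
    c = V[0][0] * V[1][1] - V[0][1] ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)


def amplitude_ratio(V, T, w2):
    """Ratio a_2/a_1 from the k = 1 secular equation
    (V11 - w^2 T11) a_1 + (V12 - w^2 T12) a_2 = 0."""
    return -(V[0][0] - w2 * T[0][0]) / (V[0][1] - w2 * T[0][1])


# Equal lengths and masses, in units where m L^2 = 1 and omega_0 = 1.
T_dp = [[2.0, 1.0], [1.0, 1.0]]
V_dp = [[2.0, 0.0], [0.0, 1.0]]

w2_low, w2_high = secular_roots_2x2(V_dp, T_dp)
ratio_low = amplitude_ratio(V_dp, T_dp, w2_low)
ratio_high = amplitude_ratio(V_dp, T_dp, w2_high)
```

The lower root gives an amplitude ratio of $+\sqrt{2}$ (the second pendulum in phase with the first) and the upper root gives $-\sqrt{2}$ (out of phase), in agreement with the statement above.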
b) Large amplitude chaotic regime 
Stachowiak and Okada [Sta05] used computer simulations to numerically analyze the behavior of this 
system as the oscillation amplitudes increase. Poincaré sections, bifurcation diagrams, and Lyapunov 
exponents all confirm that this system evolves from regular normal-mode oscillatory behavior in the linear 
regime at low energy, to chaotic behavior at high excitation energies where non-linearity dominates. This 
behavior is analogous to that of the driven, linearly-damped, harmonic pendulum described in chapter 3.5. 


14.8 Three-body coupled linear oscillator systems 


Chapter 14.7 discussed parallel and series arrangements of two coupled oscillators. Extending from two to 
three coupled linear oscillators introduces interesting new characteristics of coupled oscillator systems. For 
more than two coupled oscillators, coupled oscillator systems separate into two classifications depending on 
whether each oscillator is coupled to all of the remaining $n - 1$ oscillators, or only to its 
nearest neighbors, as illustrated below. 


14.6 Example: Three plane pendula; mean-field linear coupling 


Consider three identical pendula with mass $m$ and length $b$, suspended from a common support that yields 
slightly to pendulum motion, leading to a coupling between all three pendula as illustrated in the adjacent 
figure. Assume that the three pendula all move in the same plane. This case is analogous to the piano where 
three strings in the treble section are coupled by the slightly-yielding common bridge plus sounding board, 
leading to coupling between each of the three coupled oscillators. This case illustrates the important 
concept of degeneracy. 

[Figure: Three plane pendula with complete linear coupling.] 

The generalized coordinates are the angles $\theta_1$, $\theta_2$, and $\theta_3$. Assume that the support 
yields such that the actual deflection angle for pendulum 1 is 

$$\theta_1' = \theta_1 - \frac{\epsilon}{2}\left(\theta_2 + \theta_3\right)$$

where the coupling coefficient $\epsilon$ is small and involves all the pendula, not just the nearest 
neighbors. Assume that the same coupling relation exists for the other angle coordinates. The gravitational 
potential energy of each pendulum is given by 

$$U_1 = mgb\left(1 - \cos\theta_1'\right) \approx \frac{1}{2}mgb\,\theta_1'^2$$

assuming the small angle approximation. Ignoring terms of order $\epsilon^2$ gives that the potential energy 

$$U = \frac{mgb}{2}\left(\theta_1'^2 + \theta_2'^2 + \theta_3'^2\right) = \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2 + \theta_3^2 - 2\epsilon\theta_1\theta_2 - 2\epsilon\theta_1\theta_3 - 2\epsilon\theta_2\theta_3\right)$$

The kinetic energy evaluated at the equilibrium location is 

$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_3\right)^2$$

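The next stage is to evaluate the {T} and {V} tensors 

$$\mathbf{T} = mb^2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{V} = mgb\begin{pmatrix} 1 & -\epsilon & -\epsilon \\ -\epsilon & 1 & -\epsilon \\ -\epsilon & -\epsilon & 1 \end{pmatrix}$$

The secular determinant is then 

$$mgb\begin{vmatrix} 1 - \frac{b}{g}\omega^2 & -\epsilon & -\epsilon \\ -\epsilon & 1 - \frac{b}{g}\omega^2 & -\epsilon \\ -\epsilon & -\epsilon & 1 - \frac{b}{g}\omega^2 \end{vmatrix} = 0$$

Expanding and factoring gives 

$$\left(\frac{b}{g}\omega^2 - 1 - \epsilon\right)^2\left(\frac{b}{g}\omega^2 - 1 + 2\epsilon\right) = 0$$

The roots are 

$$\omega_1 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon} \qquad \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon} \qquad \omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$$

This case results in two degenerate eigenfrequencies, $\omega_1 = \omega_2$, while $\omega_3$ is the lowest 
eigenfrequency. 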
The eigenvectors can be determined by substitution of the eigenfrequencies into 

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$

Consider the lowest eigenfrequency $\omega_3$, i.e. $r = 3$, for $k = 1$. Substituting 
$\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$ gives 

$$2\epsilon a_{13} - \epsilon a_{23} - \epsilon a_{33} = 0$$

while for $r = 3, k = 2$ 

$$-\epsilon a_{13} + 2\epsilon a_{23} - \epsilon a_{33} = 0$$

Solving these gives 

$$a_{13} = a_{23} = a_{33}$$

Assuming that the eigenfunction is normalized to unity 

$$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1$$

then for the third eigenvector $\mathbf{a}_3$ 

$$a_{13} = a_{23} = a_{33} = \frac{1}{\sqrt{3}}$$

This solution corresponds to all three pendula oscillating in phase with the same amplitude, that is, a coherent 
oscillation. 

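Derivation of the eigenfunctions for the other two eigenfrequencies is complicated because, owing to the 
degeneracy $\omega_1 = \omega_2$, there are only five independent equations to specify the six unknowns for 
the eigenvectors $\mathbf{a}_1$ and $\mathbf{a}_2$. That is, the eigenvectors can be chosen freely as long 
as the orthogonality and normalization are satisfied. For example, setting $a_{31} = 0$, to remove the 
indeterminacy, results in the {a} matrix 

$$\{\mathbf{a}\} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{pmatrix}$$

and thus the solution is given by 

$$\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{pmatrix}\begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix}$$

The normal modes are obtained by taking the inverse matrix $\{\mathbf{a}\}^{-1}$ and using 
$\{\boldsymbol{\eta}\} = \{\mathbf{a}\}^{-1}\{\boldsymbol{\theta}\}$. Note that since {a} is real and 
orthogonal, then $\{\mathbf{a}\}^{-1}$ equals the transpose of {a}. That is, 

$$\begin{pmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{pmatrix} = \begin{pmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} & 0 \\ \frac{1}{6}\sqrt{6} & \frac{1}{6}\sqrt{6} & -\frac{1}{3}\sqrt{6} \\ \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} \end{pmatrix}\begin{pmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{pmatrix}$$
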

The normal mode $\eta_3$ has eigenfrequency 

$$\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$$

and eigenvector 

$$\eta_3 = \frac{1}{\sqrt{3}}\left(\theta_1 + \theta_2 + \theta_3\right)$$

This corresponds to the in-phase oscillation of all three pendula. 
The other two degenerate solutions are 

$$\eta_1 = \frac{1}{\sqrt{2}}\left(\theta_1 - \theta_2\right) \qquad \eta_2 = \frac{1}{\sqrt{6}}\left(\theta_1 + \theta_2 - 2\theta_3\right)$$

with eigenvalues 

$$\omega_1 = \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon}$$

These two degenerate normal modes correspond to two pendula oscillating out of phase with the same 
amplitude, or two oscillating in phase with the same amplitude and the third out of phase with twice the 
amplitude. An important result of this toy model is that the most symmetric mode $\eta_3$ is pushed far from 
all the other modes. Note that for this example, the coherent mode $\mathbf{a}_3$ corresponds to the 
center-of-mass oscillation with no relative motion between the three pendula. This is in contrast to the 
eigenvectors $\mathbf{a}_1$ and $\mathbf{a}_2$ which both correspond to relative motion of the pendula such 
that there is zero center-of-mass motion. This mean-field coupling behavior is exhibited by collective 
motion in nuclei as discussed in example 14.12. 
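
The freedom to choose eigenvectors within the degenerate subspace can be illustrated with a Gram-Schmidt construction that uses the kinetic energy tensor as the metric. This is a sketch under stated assumptions and not the procedure used above: the function names, the starting vectors, the value of $\epsilon$, and the units $mb^2 = 1$, $g/b = 1$ are all illustrative choices.

```python
import math


def t_inner(u, v, T):
    """Inner product sum_jk T_jk u_j v_k, the metric of the orthonormality relation."""
    n = len(u)
    return sum(T[j][k] * u[j] * v[k] for j in range(n) for k in range(n))


def t_orthonormalize(vectors, T):
    """Modified Gram-Schmidt with respect to the metric T."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            proj = t_inner(e, w, T)
            w = [wi - proj * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(t_inner(w, w, T))
        basis.append([wi / norm for wi in w])
    return basis


EPS = 0.05  # hypothetical small coupling coefficient
T3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
V3 = [[1.0, -EPS, -EPS], [-EPS, 1.0, -EPS], [-EPS, -EPS, 1.0]]

# Any two independent vectors with zero center-of-mass motion span the
# degenerate subspace; these two starting vectors are an arbitrary choice.
e1, e2 = t_orthonormalize([[1.0, -1.0, 0.0], [1.0, 0.0, -1.0]], T3)

# Check that e2 is still an eigenvector with eigenvalue (1 + EPS).
V_e2 = [sum(V3[k][j] * e2[j] for j in range(3)) for k in range(3)]
```

Starting from these two vectors the construction returns $(1, -1, 0)/\sqrt{2}$ and $(1, 1, -2)/\sqrt{6}$, which are the first two columns of the {a} matrix quoted above. A different pair of starting vectors would give a different, equally valid, orthonormal pair.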


14.7 Example: Three plane pendula; nearest-neighbor coupling 


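There is a large and important class of coupled oscillators where the coupling is only between nearest 
neighbors; a crystalline lattice is a classic example. A toy model for such a system is the case of three 
identical pendula coupled by two identical springs, where only the nearest neighbors are coupled as shown 
in the adjacent figure. Assume the identical pendula are of length $b$ and mass $m$. As in the last example, 
the kinetic energy evaluated at the equilibrium location is 

$$T = \frac{1}{2}mb^2\dot{\theta}_1^2 + \frac{1}{2}mb^2\dot{\theta}_2^2 + \frac{1}{2}mb^2\dot{\theta}_3^2$$

[Figure: Three plane pendula with nearest-neighbour coupling.] 

The gravitational potential energy of each pendulum equals 
$mgb(1 - \cos\theta) \approx \frac{1}{2}mgb\,\theta^2$, thus 

$$U_{grav} = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right)$$

while the potential energy in the springs is given by 

$$U_{spring} = \frac{1}{2}\kappa b^2\left[\left(\theta_2 - \theta_1\right)^2 + \left(\theta_3 - \theta_2\right)^2\right] = \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$

Thus the total potential energy is given by 

$$U = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right) + \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$

The Lagrangian then becomes 

$$L = \frac{1}{2}mb^2\left(\dot{\theta}_1^2 + \dot{\theta}_2^2 + \dot{\theta}_3^2\right) - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_1^2 - \frac{1}{2}\left(mgb + 2\kappa b^2\right)\theta_2^2 - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_3^2 + \kappa b^2\left(\theta_1\theta_2 + \theta_2\theta_3\right)$$

Using this in the Euler-Lagrange equations gives the equations of motion 

$$mb^2\ddot{\theta}_1 + \left(mgb + \kappa b^2\right)\theta_1 - \kappa b^2\theta_2 = 0$$

$$mb^2\ddot{\theta}_2 + \left(mgb + 2\kappa b^2\right)\theta_2 - \kappa b^2\left(\theta_1 + \theta_3\right) = 0$$

$$mb^2\ddot{\theta}_3 + \left(mgb + \kappa b^2\right)\theta_3 - \kappa b^2\theta_2 = 0$$

The general analytic approach requires the T and V energy tensors given by 

$$\mathbf{T} = mb^2\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{V} = \begin{pmatrix} mgb + \kappa b^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 \end{pmatrix}$$

Note that in contrast to the prior case of three fully-coupled pendula, for the nearest neighbor case the 
potential energy tensor {V} is non-zero only on the diagonal and on the $\pm 1$ components parallel to the 
diagonal. 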
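The third stage is to evaluate the secular determinant of the $\left(\mathbf{V} - \omega^2\mathbf{T}\right)$ 
matrix, that is 

$$\begin{vmatrix} mgb + \kappa b^2 - \omega^2 mb^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 - \omega^2 mb^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 - \omega^2 mb^2 \end{vmatrix} = 0$$

This results in the characteristic equation 

$$\left(mgb - \omega^2 mb^2\right)\left(mgb + \kappa b^2 - \omega^2 mb^2\right)\left(mgb + 3\kappa b^2 - \omega^2 mb^2\right) = 0$$

which results in the three non-degenerate eigenfrequencies for the normal modes. 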
The normal modes are similar to the prior case of complete linear 
coupling, as shown in the adjacent figure. 
wi = /% This lowest mode 1, involves the three pendula oscillating n 
in phase such that the springs are not stretched or compressed thus the 1 
period of this coherent oscillation is the same as an independent pendulum 
of mass m and length b. That is 


1 
n = V3 
we = \/¢ +4. This second mode n has the central mass stationary with dy p=, P3s=¢1 


the outer pendula oscillating with the same amplitude and out of phase. 
That is 


(91, 02,03) 


1 
n = — (41,0, —03) 
2 V2 E 3 n 


w3 =4/7 + de . This third mode 13 involves the outer pendula in phase 


with the same amplitude while the central pendulum oscillating with angle 
03 = —20,. That is 
1 
= — (01, —202,0 
ig Je | i da, 3) A ha 


Similar to the prior case of three completely-coupled pendula, the coherent
normal mode $\eta_1$ corresponds to an oscillation of the center-of-mass with
no relative motion, while $\eta_2$ and $\eta_3$ correspond to relative motion of
the pendula with stationary center of mass motion. In contrast to the
prior example of complete coupling, for nearest neighbor coupling the two
higher lying solutions are not degenerate. That is, the nearest neighbor
coupling solutions differ from when all masses are linearly coupled.
It is interesting to note that this example combines two coupling mechanisms
that can be used to predict the solutions for two extreme cases
by switching off one of these coupling mechanisms. Switching off the
coupling springs, by setting $\kappa = 0$, makes all three normal frequencies
degenerate with $\omega_1 = \omega_2 = \omega_3 = \sqrt{\frac{g}{b}}$. This corresponds to three independent
identical pendula each with frequency $\omega = \sqrt{\frac{g}{b}}$. The three
linear combinations $\eta_1$, $\eta_2$, $\eta_3$ also have this same frequency; in particular
$\eta_1$ corresponds to an in-phase oscillation of the three pendula. The three
uncoupled pendula are independent and any combination of the three modes is allowed since the three frequencies
are degenerate.

Figure: Normal modes of three plane pendula with nearest-neighbour coupling.

The other extreme is to let $g = 0$, that is, switch off the gravitational field, or let $b \to \infty$; then the only
coupling is due to the two springs. This results in $\omega_1 = 0$ because there is no restoring force acting on the
coherent motion of the three in-phase coupled oscillators. As a result, oscillatory motion cannot be sustained,
since this mode corresponds to a center of mass oscillation with no external forces acting, which is spurious. That
is, this spurious solution corresponds to constant linear translation.
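The eigenfrequencies and mode shapes quoted above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the numerical parameter values are illustrative assumptions, and the check simply substitutes each mode into $(\mathbf{V} - \omega^2\mathbf{T})\mathbf{a} = 0$ using the tensors given above.

```python
import math

# Mode shapes (theta_1, theta_2, theta_3) quoted in the text, unnormalized.
MODES = ((1.0, 1.0, 1.0), (1.0, 0.0, -1.0), (1.0, -2.0, 1.0))


def pendula_frequencies(g, b, kappa, m):
    """Roots of the factored characteristic equation for the three pendula."""
    return (
        math.sqrt(g / b),
        math.sqrt(g / b + kappa / m),
        math.sqrt(g / b + 3.0 * kappa / m),
    )


def secular_residual(omega, a, g, b, kappa, m):
    """Largest component of (V - omega^2 T) a; it vanishes for a normal mode."""
    d = m * g * b - omega**2 * m * b**2   # part common to all diagonal terms
    k = kappa * b**2                      # spring coupling term
    rows = (
        (d + k) * a[0] - k * a[1],
        -k * a[0] + (d + 2.0 * k) * a[1] - k * a[2],
        -k * a[1] + (d + k) * a[2],
    )
    return max(abs(r) for r in rows)


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    params = dict(g=9.81, b=1.0, kappa=2.0, m=0.5)
    for omega, a in zip(pendula_frequencies(**params), MODES):
        print(f"omega = {omega:.4f}, residual = {secular_residual(omega, a, **params):.2e}")
```

Setting `kappa=0.0` in this sketch reproduces the degenerate limit discussed above, where all three frequencies collapse to $\sqrt{g/b}$.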
14.8 Example: System of three bodies coupled by six springs


Consider the completely-coupled mechanical system shown in the adjacent figure. 

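1) The first stage is to determine the potential and kinetic energies using an appropriate set of
generalized coordinates, which here are $x_1$, $x_2$, and $x_3$. The potential energy is the sum of the potential energies
for each of the six springs
$$U = \frac{3}{2}\kappa x_1^2 + \frac{3}{2}\kappa x_2^2 + \frac{3}{2}\kappa x_3^2 - \kappa x_1 x_2 - \kappa x_1 x_3 - \kappa x_2 x_3$$
while the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 + \frac{1}{2}m\dot{x}_3^2$$

2) The second stage is to evaluate the potential energy $\mathbf{V}$ and
kinetic energy $\mathbf{T}$ tensors.

$$\mathbf{V} = \begin{pmatrix} 3\kappa & -\kappa & -\kappa \\ -\kappa & 3\kappa & -\kappa \\ -\kappa & -\kappa & 3\kappa \end{pmatrix} \qquad \mathbf{T} = \begin{pmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & m \end{pmatrix}$$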


Note that for this case the kinetic energy tensor is diagonal whereas 
the potential energy tensor is nondiagonal and corresponds to com- 
plete coupling of the three coordinates. 

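3) The third stage is to use the potential $\mathbf{V}$ and kinetic $\mathbf{T}$
energy tensors to evaluate the secular determinant giving

$$\begin{vmatrix} (3\kappa - m\omega^2) & -\kappa & -\kappa \\ -\kappa & (3\kappa - m\omega^2) & -\kappa \\ -\kappa & -\kappa & (3\kappa - m\omega^2) \end{vmatrix} = 0$$

Figure: System of three bodies coupled by six springs.

The expansion of this secular determinant yields

$$(\kappa - m\omega^2)(4\kappa - m\omega^2)(4\kappa - m\omega^2) = 0$$

The solution for this completely-coupled system has two degenerate eigenvalues.

$$\omega_1 = \omega_2 = 2\sqrt{\frac{\kappa}{m}} \qquad \omega_3 = \sqrt{\frac{\kappa}{m}}$$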


4) The fourth step is to insert these eigenfrequencies into the secular equation

$$\sum_j \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$
to determine the coefficients $a_{jr}$.
5) The final stage is to write the general coordinates in terms of the normal coordinates.
The result is that the angular frequency $\omega_3 = \sqrt{\frac{\kappa}{m}}$ corresponds to a normal mode for which the three
masses oscillate in phase corresponding to a center-of-mass oscillation with no relative motion of the masses.

$$\eta_3 = \frac{1}{\sqrt{3}}\left(x_1 + x_2 + x_3\right)$$

For this coherent motion only one spring per mass is stretched, resulting in the same frequency as one
mass on a spring. The other two solutions correspond to the three masses oscillating out of phase, which
implies all three springs are stretched and thus the angular frequency is higher. Since the two eigenvalues
$\omega_1 = \omega_2 = 2\sqrt{\frac{\kappa}{m}}$ are degenerate, there are only five independent equations to specify the six unknowns
for the degenerate eigenvalues. Thus it is possible to select a combination of the eigenvectors $\eta_1$ and $\eta_2$ such
that the combination is orthogonal to $\eta_3$. Choose $a_{31} = 0$ to remove the indeterminacy. Then adding or
subtracting gives that the normal modes are

$$\eta_1 = \frac{1}{\sqrt{2}}\left(x_1 - x_2\right) \qquad \eta_2 = \frac{1}{\sqrt{6}}\left(x_1 + x_2 - 2x_3\right)$$
These two degenerate normal modes correspond to relative motion of the masses with stationary center-of-mass.
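As a quick consistency check on the five stages above, each normal mode can be substituted back into $\mathbf{V}\mathbf{a} = m\omega^2\mathbf{a}$. The sketch below assumes the tensors given in stage 2; the helper names, the choice of degenerate mode vectors $(1,-1,0)$ and $(1,1,-2)$, and the parameter values are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def matvec(matrix, vec):
    """Plain-Python matrix-vector product."""
    return [sum(row[j] * vec[j] for j in range(len(vec))) for row in matrix]


def mode_residual(kappa, m, omega, a):
    """Largest component of (V - m omega^2 I) a for the six-spring system."""
    V = [
        [3.0 * kappa, -kappa, -kappa],
        [-kappa, 3.0 * kappa, -kappa],
        [-kappa, -kappa, 3.0 * kappa],
    ]
    Va = matvec(V, a)
    return max(abs(Va[i] - m * omega**2 * a[i]) for i in range(3))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


if __name__ == "__main__":
    kappa, m = 3.0, 2.0                      # hypothetical values
    w_high = 2.0 * math.sqrt(kappa / m)      # doubly degenerate frequency
    w_low = math.sqrt(kappa / m)             # coherent (in-phase) frequency
    eta1, eta2, eta3 = (1.0, -1.0, 0.0), (1.0, 1.0, -2.0), (1.0, 1.0, 1.0)
    for omega, a in ((w_high, eta1), (w_high, eta2), (w_low, eta3)):
        print(a, mode_residual(kappa, m, omega, a))
    print("orthogonality:", dot(eta1, eta2), dot(eta1, eta3), dot(eta2, eta3))
```

Any other orthogonal pair spanning the same degenerate subspace would pass the same check, which is the indeterminacy referred to above.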
14.9 Molecular coupled oscillator systems


There are many examples of coupled oscillations in atomic and molecular physics most of which involve 
nearest-neighbor coupling. The following two examples are for molecular coupled oscillators. The triatomic 
molecule is a typical linearly-coupled molecular oscillator. The benzene molecule is an elementary example 
of a ring structure coupled oscillator. 


14.9 Example: Linear triatomic molecule CO$_2$


Molecules provide excellent examples of vibrational modes involving nearest neighbor coupling. Depending
on the atomic structure, triatomic molecules can be either linear, like CO$_2$, or bent like water, H$_2$O, which
has a bend angle of $\theta \approx 104.5^\circ$. A molecule with $n$ atoms has $3n$ degrees of freedom. There are three degrees
of freedom for translation and three degrees of freedom for rotation leaving $3n - 6$ degrees of freedom for
vibrations. A triatomic molecule has three vibrational modes, two longitudinal and one transverse. Consider
the normal modes for vibration of the linear molecule CO$_2$.


Longitudinal modes 


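The coordinate system used is illustrated in the adjacent figure.
The Lagrangian for this system is

$$L = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}M\dot{x}_2^2 + \frac{1}{2}m\dot{x}_3^2 - \frac{1}{2}\kappa(x_2 - x_1)^2 - \frac{1}{2}\kappa(x_3 - x_2)^2$$

Evaluating the kinetic energy tensor gives

$$\mathbf{T} = \begin{pmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{pmatrix}$$

while the potential energy tensor gives

$$\mathbf{V} = \kappa\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}$$
The secular equation becomes
$$\begin{vmatrix} (-m\omega^2 + \kappa) & -\kappa & 0 \\ -\kappa & (-M\omega^2 + 2\kappa) & -\kappa \\ 0 & -\kappa & (-m\omega^2 + \kappa) \end{vmatrix} = 0$$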


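Note that the same answer is obtained using Newtonian mechanics. That is, the force equation gives

$$\begin{aligned}
m\ddot{x}_1 - \kappa(x_2 - x_1) &= 0 \\
M\ddot{x}_2 + \kappa(x_2 - x_1) - \kappa(x_3 - x_2) &= 0 \\
m\ddot{x}_3 + \kappa(x_3 - x_2) &= 0
\end{aligned}$$
Let the solution be of the form
$$x_j = a_j e^{i\omega t} \qquad j = 1, 2, 3$$

Substituting this solution gives
$$\begin{aligned}
(-m\omega^2 + \kappa)a_1 - \kappa a_2 &= 0 \\
-\kappa a_1 + (-M\omega^2 + 2\kappa)a_2 - \kappa a_3 &= 0 \\
-\kappa a_2 + (-m\omega^2 + \kappa)a_3 &= 0
\end{aligned}$$

This leads to the same secular determinant as given above with the matrix elements clustered along the
diagonal for nearest-neighbor problems.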
Expanding the determinant and collecting terms
yields

$$\omega^2\left(-m\omega^2 + \kappa\right)\left(-mM\omega^2 + \kappa M + 2\kappa m\right) = 0$$

Equating any of the three factors to zero gives

$$\omega_1 = 0 \qquad \omega_2 = \sqrt{\frac{\kappa}{m}} \qquad \omega_3 = \sqrt{\frac{\kappa}{m} + \frac{2\kappa}{M}}$$

The solutions are:

1) $\omega_1 = 0$: This solution gives $\eta_1 = a\{1, 1, 1\}$. This
mode is not an oscillation at all, but is a pure translation
of the system as a whole as shown in the adjacent
figure. There is no change in the restoring forces since
the system moves such as not to change the length of the
springs, that is, they stay in their equilibrium positions.
This motion corresponds to a spurious oscillation of the center of mass that results from referencing the
three atom locations with respect to some fixed reference point. This reference point should have been chosen
as the center of mass since the motion of the center-of-mass already has been taken into account separately.
Spurious center of mass oscillations occur any time that the reference point is not at the center of mass for
an isolated system with no external forces acting.

2) $\omega_2 = \sqrt{\frac{\kappa}{m}}$: This solution corresponds to $\eta_2 = a\{1, 0, -1\}$ and is shown in the adjacent figure. The
central mass $M$ remains stationary while the two end masses vibrate longitudinally in opposite directions
with the same amplitude. This mode has a stationary center of mass. For CO$_2$ the electrical geometry is
O$^{-}$C$^{++}$O$^{-}$. Mode 2 for CO$_2$ does not radiate electromagnetically because the center of charge is stationary
with respect to the center of mass, that is, the electric dipole moment is constant.


Figure: Normal modes of a linear triatomic molecule.


3) $\omega_3 = \sqrt{\frac{\kappa}{m} + \frac{2\kappa}{M}}$: This solution corresponds to $\eta_3 = a\{1, -2\left(\frac{m}{M}\right), 1\}$. As shown in the adjacent
figure, this motion corresponds to the two end masses vibrating in unison while the central mass vibrates
oppositely with a different amplitude such that the center-of-mass is stationary. This CO$_2$ mode does radiate
electromagnetically since it corresponds to an oscillating electric dipole.

It is interesting to note that the ratio $\frac{\omega_3}{\omega_2} = 1.915$ for CO$_2$ and the ratio of the two modes is independent
of the potential energy tensor $\mathbf{V}$. That is
$$\frac{\omega_3}{\omega_2} = \sqrt{1 + \frac{2m}{M}}$$
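The frequency ratio of the two non-spurious longitudinal modes follows directly from the eigenfrequencies above, $\omega_3/\omega_2 = \sqrt{1 + 2m/M}$, so the spring constant cancels. The short sketch below evaluates it for CO$_2$; the mass numbers $m = 16$ for oxygen and $M = 12$ for carbon are assumed here for illustration, and the function name is hypothetical.

```python
import math


def longitudinal_frequencies(kappa, m, M):
    """Longitudinal eigenfrequencies of a linear triatomic molecule (end masses m, centre mass M)."""
    return (0.0, math.sqrt(kappa / m), math.sqrt(kappa / m + 2.0 * kappa / M))


if __name__ == "__main__":
    m_oxygen, m_carbon = 16.0, 12.0      # assumed mass numbers, arbitrary units
    for kappa in (1.0, 250.0):           # the ratio should not depend on kappa
        _, w2, w3 = longitudinal_frequencies(kappa, m_oxygen, m_carbon)
        print(f"kappa = {kappa}: omega3/omega2 = {w3 / w2:.3f}")
```

Running the loop with two very different values of `kappa` illustrates that the ratio depends only on the mass ratio.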


Transverse modes 


The solutions are: 


4) $\omega_4 = \sqrt{2\left(\frac{2m + M}{M}\right)\frac{\kappa}{m}}$. This is the only non-spurious transverse mode $\eta_4$, which corresponds to the two
outside masses vibrating in unison transverse to the symmetry axis while the central mass vibrates oppositely.
This mode radiates electric dipole radiation since the electric dipole is oscillating.

5) $\omega_5 = 0$. This transverse solution $\eta_5$ has all three nuclei vibrating in unison transverse to the symmetry
axis and corresponds to a spurious center of mass oscillation.

6) $\omega_6 = 0$. This transverse solution $\eta_6$ corresponds to a stationary central mass with the two outside
masses vibrating oppositely. This corresponds to a rotational oscillation of the molecule which is spurious
since there are no torques acting on the molecule for a central force. Rotational motion usually is taken into
account separately.

The normal modes for the bent triatomic molecule are similar except that the oscillator coupling strength
is reduced by the factor $\cos\theta$ where $\theta$ is the bend angle.
14.10 Example: Benzene ring


The benzene ring comprises six carbon atoms bound in a plane hexagonal ring. A classical analog of the
benzene ring comprises 6 identical masses $m$ on a frictionless ring of radius $r$ bound by 6 identical springs with linear
spring constant $\kappa$, as illustrated in the adjacent figure. Consider only the in-plane motion, then the kinetic
energy is given by

$$T = \frac{1}{2}mr^2\sum_{i=1}^{6}\dot{\theta}_i^2$$

The potential energy equals
$$U = \frac{1}{2}\kappa r^2\sum_{i=1}^{6}\left(\theta_{i+1} - \theta_i\right)^2 = \kappa r^2\left[\sum_{i=1}^{6}\theta_i^2 - \left(\theta_1\theta_2 + \theta_2\theta_3 + \theta_3\theta_4 + \theta_4\theta_5 + \theta_5\theta_6 + \theta_6\theta_1\right)\right]$$

where $i = 7 \equiv 1$. Thus the kinetic energy and potential energy tensors are given by


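$$\mathbf{T} = mr^2\begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \qquad \mathbf{V} = \kappa r^2\begin{pmatrix} 2 & -1 & 0 & 0 & 0 & -1 \\ -1 & 2 & -1 & 0 & 0 & 0 \\ 0 & -1 & 2 & -1 & 0 & 0 \\ 0 & 0 & -1 & 2 & -1 & 0 \\ 0 & 0 & 0 & -1 & 2 & -1 \\ -1 & 0 & 0 & 0 & -1 & 2 \end{pmatrix}$$
This nearest-neighbor system includes non-zero $(n, 1)$ and $(1, n)$ elements due to the ring structure. Define
$x = \frac{m\omega^2}{\kappa} - 2$, then the solution of the set of linear homogeneous equations requires that
$$\begin{vmatrix} x & 1 & 0 & 0 & 0 & 1 \\ 1 & x & 1 & 0 & 0 & 0 \\ 0 & 1 & x & 1 & 0 & 0 \\ 0 & 0 & 1 & x & 1 & 0 \\ 0 & 0 & 0 & 1 & x & 1 \\ 1 & 0 & 0 & 0 & 1 & x \end{vmatrix} = 0$$
that is

$$(x - 2)(x - 1)^2(x + 1)^2(x + 2) = 0$$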
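The eigenvalues and eigenfunctions are given in the table.

Figure: Classical analog of a benzene molecular ring.

n    x     ω                  Normal modes
1    2     $2\sqrt{\kappa/m}$     $\theta_1 - \theta_2 + \theta_3 - \theta_4 + \theta_5 - \theta_6$
2    1     $\sqrt{3\kappa/m}$     $\theta_2 - \theta_3 + \theta_5 - \theta_6$
3    1     $\sqrt{3\kappa/m}$     $\theta_1 - \theta_2 + \theta_4 - \theta_5$
4    -1    $\sqrt{\kappa/m}$      $\theta_1 - \theta_3 - \theta_4 + \theta_6$
5    -1    $\sqrt{\kappa/m}$      $-\theta_1 - \theta_2 + \theta_4 + \theta_5$
6    -2    $0$                  $\theta_1 + \theta_2 + \theta_3 + \theta_4 + \theta_5 + \theta_6$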


Note the following properties of the normal modes and their frequencies. 

n = 1: Adjacent masses vibrate 180° out of phase, thus each spring has maximal compression or extension, 
leading to the energy of this normal mode being the highest. 

n = 2,3: These two solutions are degenerate and correspond to two pairs of masses vibrating out of phase
while the third pair of masses is stationary. Thus the energy of this normal mode is slightly lower than the
n = 1 normal mode. Any combination of these degenerate normal modes is an equally good solution.

n = 4,5: From the figure it can be seen that both of these solutions correspond to a center of mass 
oscillation and thus these modes are spurious. 

n = 6: This vibrational mode has zero energy corresponding to zero restoring force and all six masses 
moving uniformly in the same direction. This mode corresponds to the rotation of the benzene molecule about 
the symmetry axis of the ring which usually is taken into account assuming a separate rotational component. 

This classical analog of the benzene molecule is interesting because it simultaneously exhibits degenerate 
normal modes, spurious center of mass oscillation, and a rotational mode. 
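The ring spectrum can be cross-checked without expanding the $6 \times 6$ determinant. The sketch below relies on a standard result that is not derived in the text: for a ring of $n$ identical masses the plane-wave modes give $x = -2\cos(2\pi k/n)$ for $k = 0, \dots, n-1$. It also tests candidate mode vectors against the ring equations $x a_i + a_{i-1} + a_{i+1} = 0$. Function names and the particular test vectors are illustrative assumptions.

```python
import math


def ring_x_values(n=6):
    """Values of x = m*omega^2/kappa - 2 for the n plane-wave modes of the ring."""
    return [-2.0 * math.cos(2.0 * math.pi * k / n) for k in range(n)]


def ring_residual(x, a):
    """Largest violation of x*a_i + a_(i-1) + a_(i+1) = 0 with periodic indices."""
    n = len(a)
    return max(abs(x * a[i] + a[i - 1] + a[(i + 1) % n]) for i in range(n))


def characteristic_polynomial(x):
    """Factored secular determinant quoted in the text."""
    return (x - 2) * (x - 1) ** 2 * (x + 1) ** 2 * (x + 2)


if __name__ == "__main__":
    for x in sorted(ring_x_values()):
        omega_sq = x + 2.0               # omega^2 in units of kappa/m
        print(f"x = {x:+.3f}, omega^2 = {omega_sq:.3f} kappa/m, "
              f"polynomial = {characteristic_polynomial(x):.1e}")
    print(ring_residual(2, [1, -1, 1, -1, 1, -1]))    # alternating mode
    print(ring_residual(-2, [1, 1, 1, 1, 1, 1]))      # uniform rotation
```

The printed values of $\omega^2$ are $0$, $\kappa/m$ (twice), $3\kappa/m$ (twice), and $4\kappa/m$, matching the degeneracy pattern of the factored determinant.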
14.10 Discrete Lattice Chain


Crystalline lattices and linear molecules are important classes of coupled oscillator systems where nearest 
neighbor interactions dominate. A crystalline lattice comprises thousands of coupled oscillators in a three-dimensional
matrix with atomic spacing of a few $10^{-10}$ m. Even though a full description of the dynamics of
crystalline lattices demands a quantal treatment, a classical treatment is of interest since classical mechanics 
underlies many features of the motion of atoms in a crystalline lattice. The linear discrete lattice chain is 
the simplest example of many-body coupled oscillator systems that can illuminate the physics underlying a 
range of interesting phenomena in solid-state physics. As illustrated in example 2.7, the linear approxima- 
tion usually is applicable for small-amplitude displacements of nearest-neighbor interacting systems which 
greatly simplifies treatment of the lattice chain. The linear discrete lattice chain involves three independent 
polarization modes, one longitudinal mode, plus two perpendicular transverse modes. The 3n degrees of 
freedom for the n atoms, on a discrete linear lattice chain, are partitioned with n degrees of freedom for each 
of the three polarization modes. These three polarization modes each have n normal modes, or n travelling 
waves, and exhibit quantization, dispersion, and can have a complex wave number. 


14.10.1 Longitudinal motion 


The equations of motion for longitudinal modes of the lattice chain can be derived by considering a linear
chain of $n$ identical masses, of mass $m$, separated by a uniform spacing $d$ as shown in Fig 14.7. Assume
that the $n$ masses are coupled by $n + 1$ springs, with spring constant $\kappa$, where both ends of the chain are
fixed, that is, the displacements $q_0 = q_{n+1} = 0$ and velocities $\dot{q}_0 = \dot{q}_{n+1} = 0$. The force required to stretch a
length $d$ of the chain by a longitudinal displacement $q_j$ for mass $j$ is $F_j = \kappa q_j$. Thus the potential energy for
stretching the spring for segment $(q_{j-1} - q_j)$ is $U_j = \frac{\kappa}{2}(q_{j-1} - q_j)^2$. The total potential and kinetic energies
are

$$U = \frac{\kappa}{2}\sum_{j=1}^{n+1}\left(q_{j-1} - q_j\right)^2 \tag{14.74}$$
$$T = \frac{1}{2}m\sum_{j=1}^{n}\dot{q}_j^2 \tag{14.75}$$


Since $q_{n+1} = 0$ the kinetic energy and Lagrangian can be
extended to $j = n + 1$, that is, the Lagrangian can be written
as

$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left(m\dot{q}_j^2 - \kappa\left(q_{j-1} - q_j\right)^2\right) \tag{14.76}$$


Using this Lagrangian in the Lagrange-Euler equations
gives the following second-order equation of motion for longitudinal
oscillations

$$\ddot{q}_j = \omega_0^2\left(q_{j-1} - 2q_j + q_{j+1}\right) \tag{14.77}$$
where $j = 1, 2, \dots, n$ and where
$$\omega_0 = \sqrt{\frac{\kappa}{m}} \tag{14.78}$$

Figure 14.7: Portion of a lattice chain of identical masses $m$ connected by identical springs
of spring constant $\kappa$. The displacement of the $j^{th}$ mass from the equilibrium position is $q_j$,
assumed to be positive to the right.


14.10.2 Transverse motion


The equations of motion for transverse motion on a linear discrete lattice chain, illustrated in figure 14.8,
can be derived by considering the displacements $q_j$ of the $j^{th}$ mass for $n$ identical masses, with mass $m$,
separated by equal spacings $d$ and assuming that the tension in the string is $\tau$. Assuming that the
transverse deflections $q_j$ are small, then the $j - 1$ to $j$ spring is stretched to a length

$$d' = \sqrt{d^2 + \left(q_j - q_{j-1}\right)^2} \tag{14.79}$$
Thus the incremental stretching is
$$\delta d \approx \frac{\left(q_j - q_{j-1}\right)^2}{2d} \tag{14.80}$$
The work done against the tension $\tau$ is $\tau \cdot \delta d$ per segment. Thus the
total potential energy is

$$U = \frac{\tau}{2d}\sum_{j=1}^{n+1}\left(q_j - q_{j-1}\right)^2 \tag{14.81}$$
where $q_0$ and $q_{n+1}$ are identically zero.
The kinetic energy is
$$T = \frac{1}{2}m\sum_{j=1}^{n}\dot{q}_j^2 \tag{14.82}$$

Since $q_{n+1} = 0$, the kinetic energy and Lagrangian summations can
be extended to $j = n + 1$, that is

$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left(m\dot{q}_j^2 - \frac{\tau}{d}\left(q_{j-1} - q_j\right)^2\right) \tag{14.83}$$

Figure 14.8: Transverse motion of a linear discrete lattice chain.

Using this Lagrangian in the Lagrange-Euler equations gives the following second-order equation of motion
for transverse oscillations

$$\ddot{q}_j = \omega_0^2\left(q_{j-1} - 2q_j + q_{j+1}\right) \tag{14.84}$$
where $j = 1, 2, \dots, n$ and
$$\omega_0 = \sqrt{\frac{\tau}{md}} \tag{14.85}$$


The normal modes for the transverse modes comprise standing waves that satisfy the same boundary
conditions as for the longitudinal modes. The $n$ equations of motion for longitudinal motion, equation
14.77, or transverse motion, equation 14.84, are identical in form. The major difference is that $\omega_0$ for the
transverse normal modes, $\omega_0 = \sqrt{\frac{\tau}{md}}$, differs from that for the longitudinal modes, which is $\omega_0 = \sqrt{\frac{\kappa}{m}}$. Thus
the following discussion of the normal modes on a discrete lattice chain is identical in form for both transverse
and longitudinal waves.


14.10.3 Normal modes 


The normal modes of the $n$ equations of motion on the discrete lattice chain are either longitudinal or
transverse standing waves that satisfy the boundary conditions at the extreme ends of the lattice chain.
The solutions can be given by assuming that the $n$ identical masses on the chain oscillate with a common
frequency $\omega$. Then the displacement amplitude for the $j^{th}$ mass can be written in the form

$$q_j(t) = a_j e^{i\omega t} \tag{14.86}$$

where the amplitude $a_j$ can be complex. Substitution into the preceding $n$ equations of motion, 14.77, 14.84,
yields the following recursion relation

$$\left(-\omega^2 + 2\omega_0^2\right)a_j - \omega_0^2\left(a_{j-1} + a_{j+1}\right) = 0 \tag{14.87}$$

where $j = 1, 2, \dots, n$. Note that the boundary conditions, $q_0 = 0$ and $q_{n+1} = 0$, require that $a_0 = a_{n+1} = 0$.

The above recursion relation corresponds to a system of $n$ homogeneous algebraic equations with $n$
unknowns $a_1, a_2, \dots, a_n$. A non-trivial solution is given by setting the determinant of its coefficients equal to
zero

$$\begin{vmatrix} -\omega^2 + 2\omega_0^2 & -\omega_0^2 & 0 & \cdots & 0 \\ -\omega_0^2 & -\omega^2 + 2\omega_0^2 & -\omega_0^2 & \cdots & 0 \\ 0 & -\omega_0^2 & -\omega^2 + 2\omega_0^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -\omega^2 + 2\omega_0^2 \end{vmatrix} = 0 \tag{14.88}$$
This secular determinant corresponds to the special case of nearest neighbor interactions with the kinetic
energy tensor $\mathbf{T}$ being diagonal and the potential energy tensor $\mathbf{V}$ involving coupling only to adjacent
masses. The secular determinant is of order $n$ and thus determines exactly $n$ eigenfrequencies $\omega_r$ for each
polarization mode.

For large $n$, the solution of this problem is more efficiently obtained by using a recursion relation approach,
rather than solving the above secular determinant. The trick is to assume that the phase differences $\phi_r$
between the motion of adjacent masses all are identical for a given polarization. Then the amplitude for the
$j^{th}$ mass for the $r^{th}$ frequency mode $\omega_r$ is of the form

$$a_{jr} = a_r e^{i(j\phi_r - \delta_r)} \tag{14.89}$$
Inserting the above into the recursion relation (14.87) gives
$$\left(-\omega_r^2 + 2\omega_0^2\right) - \omega_0^2\left[e^{-i\phi_r} + e^{i\phi_r}\right] = 0 \tag{14.90}$$

which reduces to
$$\omega_r^2 = 2\omega_0^2 - 2\omega_0^2\cos\phi_r = 4\omega_0^2\sin^2\frac{\phi_r}{2}$$
that is
$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} \tag{14.91}$$
where $r = 1, 2, 3, \dots, n$.

Now it is necessary to determine the phase angle $\phi_r$, which can be done by applying the boundary
conditions for standing waves on the lattice chain. These boundary conditions for stationary modes require
that the ends of the lattice chain are nodes, that is $a_{0r} = a_{(n+1)r} = 0$. Using the fact that only the real
part of $a_{jr}$ has physical meaning leads to the amplitude for the $j^{th}$ mass for the $r^{th}$ mode to be

$$a_{jr} = a_r\cos\left(j\phi_r - \delta_r\right) \tag{14.92}$$

The boundary condition $a_{0r} = 0$ requires that the phase $\delta_r = \frac{\pi}{2}$. That is

$$a_{jr} = a_r\cos\left(j\phi_r - \frac{\pi}{2}\right) = a_r\sin j\phi_r \tag{14.93}$$

where $r = 1, 2, \dots, n$.
The boundary condition for $j = n + 1$ gives

$$a_{(n+1)r} = 0 = a_r\sin\left(n + 1\right)\phi_r \tag{14.94}$$
Therefore
$$(n + 1)\phi_r = r\pi \tag{14.95}$$
where $r = 1, 2, 3, \dots, n$. That is
$$\phi_r = \frac{r\pi}{n + 1} = \frac{r\pi d}{(n + 1)d} = \frac{r\pi d}{D} = k_r d \tag{14.96}$$

where $D = (n + 1)d$ is the total length of the discrete lattice chain.
The $n$ eigenfrequencies for a given polarization are given by

$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} = 2\omega_0\sin\frac{r\pi}{2(n + 1)} = 2\omega_0\sin\frac{r\pi d}{2D} = 2\omega_0\sin\frac{k_r d}{2} \tag{14.97}$$
where the corresponding wavenumber $k_r$ is given by
$$k_r = \frac{r\pi}{(n + 1)d} = \frac{r\pi}{D} = \frac{2\pi}{\lambda_r} \tag{14.98}$$

This implies that the normal modes are quantized with half-wavelengths $\frac{\lambda_r}{2} = \frac{D}{r}$.
Figure 14.9: Plots of the maximal vibrational amplitudes $a_{jr}$ for the $r^{th}$ frequency sinusoidal mode, versus
distance along the chain, for transverse normal modes of a vibrating discrete lattice with $n = 5$. Only $r =
1, 2, 3, 4, 5$ are distinct modes because $r = 6$ is a null mode. Note that the modes with $r = 7, 8, 9, 10, 11, 12$,
shown dashed, duplicate the locations of the mass displacement given by the lower-order modes.


Combining equations 14.96 and 14.93 gives the maximum amplitudes for the eigenvectors to be

$$a_{jr} = a_r\sin\left(j k_r d\right) \tag{14.99}$$

For $n$ independent linear oscillators there are only $n$ independent normal modes, that is, for $r = n + 1$ the
sine function in equation 14.99 must be zero. Beyond $r = n$ the equations do not describe physically new
situations. This is illustrated by figure 14.9 which shows the transverse modes of a lattice chain with $n = 5$.
There are only $n = 5$ independent normal modes of this system since $r = n + 1 = 6$ corresponds to a null
mode with all $q_j(t) = 0$. Also note that the solutions for $r > n + 1$, shown dashed, replicate the mass
locations of modes with $r < n + 1$, that is, the modes with $r > 6$ are replicas of the lower-order modes.
Note that $\omega_r$ has a maximum value $\omega_r \leq 2\omega_0$ since the sine function cannot exceed unity. This leads
to a maximum frequency $\omega_c = 2\omega_0$, called the cut-off frequency, which occurs when $k_r d = \pi$. That is, the
null-mode occurs when $r = n + 1$ for which equation 14.99 equals zero. The range of $n$ quantized normal
modes that can occur is intuitive. That is, the longest half-wavelength $\frac{\lambda_{max}}{2} = D = (n + 1)d$ equals the total
length of the discrete lattice chain. The shortest half-wavelength $\frac{\lambda_{cutoff}}{2} = d$ is set by the lattice spacing.
Thus the discrete wavenumbers of the normal modes, for each polarization, range from $k_1$ to $nk_1$, where $n$ is
an integer.

Assuming real $k_r$, the normal coordinate $\eta_r$ and corresponding frequency $\omega_r$ are

$$\eta_r = a_r e^{i\omega_r t} \tag{14.100}$$

Equations 14.97 and 14.99 give the angular frequency and displacement. Note that superposition applies
since this system is linear. Therefore the most general solution for each polarization can be any superposition
of the form

$$q_j(t) = \sum_{r=1}^{n} a_r e^{i\omega_r t}\sin\left(j k_r d\right) \tag{14.101}$$
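The standing-wave solution can be verified by substituting it back into the recursion relation 14.87. The sketch below is illustrative only: it takes $\phi_r = r\pi/(n+1)$ from equation 14.95, the eigenfrequency from equation 14.91, and amplitudes $a_{jr} = \sin j\phi_r$ from equation 14.93 with $a_r = 1$; the function names and the choice $n = 5$, $\omega_0 = 1$ are assumptions.

```python
import math


def chain_modes(n, omega0):
    """Return (omega_r, [a_0, ..., a_(n+1)]) for the n standing-wave modes of a fixed-end chain."""
    modes = []
    for r in range(1, n + 1):
        phi = r * math.pi / (n + 1)                        # eq. 14.95
        omega = 2.0 * omega0 * math.sin(phi / 2.0)         # eq. 14.91
        amps = [math.sin(j * phi) for j in range(n + 2)]   # eq. 14.93 with a_r = 1
        modes.append((omega, amps))
    return modes


def recursion_residual(omega, amps, omega0):
    """Largest violation of recursion relation 14.87 over the interior masses j = 1..n."""
    n = len(amps) - 2
    return max(
        abs((-omega**2 + 2.0 * omega0**2) * amps[j]
            - omega0**2 * (amps[j - 1] + amps[j + 1]))
        for j in range(1, n + 1)
    )


if __name__ == "__main__":
    omega0 = 1.0
    for r, (omega, amps) in enumerate(chain_modes(5, omega0), start=1):
        print(f"r = {r}: omega = {omega:.4f}, residual = {recursion_residual(omega, amps, omega0):.1e}")
```

All five frequencies stay below the cut-off value $2\omega_0$, and the end amplitudes $a_0$ and $a_{n+1}$ vanish to rounding error, as the boundary conditions require.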
14.10.4 Travelling waves


Travelling waves are equally good solutions of the equations of motion 14.77, 14.84 as are the normal modes.
Travelling waves on the one-dimensional lattice chain will be of the form

$$q(x, t) = Ce^{i(\omega t \pm kx)} \tag{14.102}$$

where the distance along the chain $x = \nu d$, that is, it is quantized in units of the cell spacing $d$, with $\nu$ being
an integer. The positive sign in the exponent corresponds to a wave travelling in the $-x$ direction while
the negative sign corresponds to a wave travelling in the $+x$ direction. The velocity of a fixed phase of the
travelling wave must satisfy that $\omega t \pm kx$ is a constant. This will occur if the phase velocity of the wave is
given by

$$v_{phase} = \left|\frac{dx}{dt}\right| = \frac{\omega}{k} \tag{14.103}$$
The wave has a frequency $f = \frac{\omega}{2\pi}$ and wavelength $\lambda = \frac{2\pi}{k}$, thus the phase velocity $v_{phase} = \frac{\omega}{k} = \lambda f$.
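Inserting the travelling wave 14.102 into the transverse equation of motion 14.84 for the discrete lattice
chain gives

$$-\omega_r^2 q = \omega_0^2\left(e^{i\phi_r} - 2 + e^{-i\phi_r}\right)q \tag{14.104}$$

for each mass $j = 1, 2, \dots, n$, where $\phi_r = k_r d$ is the phase difference between adjacent masses. That is

$$\omega_r = \pm 2\omega_0\sin\frac{\phi_r}{2} \tag{14.105}$$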


The phase $\phi_r$ is determined by the Born-von Karman periodic boundary condition that assumes that the
chain is duplicated indefinitely on either side of $k = \pm\frac{\pi}{d}$. Thus, for $n$ discrete masses, $k$ must satisfy the
condition that $q_\nu = q_{\nu + n}$. That is

$$e^{ik_r nd} = 1 \tag{14.106}$$
That is
$$k_r = \frac{2\pi r}{nd} \tag{14.107}$$
Note that the periodic boundary condition gives $n$ discrete modes
for wavenumbers between

$$-\frac{\pi}{d} \leq k_r \leq +\frac{\pi}{d} \tag{14.108}$$
where the index
$$r = -\frac{n}{2}, -\frac{n}{2} + 1, \dots, \frac{n}{2} - 1, \frac{n}{2}$$
Thus equation 14.105 becomes

$$\omega_r = \pm 2\omega_0\sin\frac{k_r d}{2} \tag{14.109}$$


Equation 14.109 is a dispersion relation that is identical to equa- 
tion 14.97 derived during the discussion of the normal modes of the 
lattice chain. This confirms that the travelling waves on the lat- 
tice chain are equally good solutions as the normal standing-wave 
modes. Clearly, superposition of the standing-wave normal modes 
can lead to travelling waves and vice versa. 


Figure 14.10: Plot of the dispersion curve ($\omega$ versus $k$) for a monoatomic
linear lattice chain subject to only nearest neighbor interactions. The
first Brillouin zone is the segment between $-\frac{\pi}{d} \leq k \leq \frac{\pi}{d}$ which covers all
independent solutions.


14.10.5 Dispersion


The lattice chain is an interesting example of a dispersive system in that $\omega_r$ is a function of $k_r$. Figure 14.10
shows a plot of the dispersion curve ($\omega$ versus $k$) for a monoatomic linear lattice chain subject to only nearest
neighbor interactions. Note that $\omega$ depends linearly on $k$ for small $k$ and that $\frac{d\omega}{dk} = 0$ at the boundaries of
the first Brillouin zone.
The lattice chain has a phase velocity for the $r^{th}$ wave given by

$$v_r^{phase} = \frac{\omega_r}{k_r} = \omega_0 d\left|\frac{\sin\frac{k_r d}{2}}{\frac{k_r d}{2}}\right| \tag{14.110}$$
while the group velocity is
$$v_r^{group} = \left(\frac{d\omega}{dk}\right)_r = \omega_0 d\cos\frac{k_r d}{2} \tag{14.111}$$

Note that in the limit when $\frac{k_r d}{2} \to 0$, the phase velocity and group velocity are identical, that is, $v_r^{phase} =
v_r^{group} = \omega_0 d$.

14.10.6 Complex wavenumber 


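The maximum allowed frequency, which is called the cut-off frequency, $\omega_c = 2\omega_0$, occurs when $k_r d = \pi$, that
is, $\frac{\lambda}{2} = d$. That is, the minimum half-wavelength equals the spacing $d$ between the discrete masses. At the
cut-off frequency, the phase velocity is $v^{phase} = \frac{2}{\pi}\omega_0 d$ and the group velocity $v^{group} = 0$.

It is interesting to note that $\omega_r$ can exceed the cut-off frequency $\omega_c = 2\omega_0$ if $k_r$ is assumed to be complex,
that is, if

$$k_r = \kappa_r - i\Gamma_r \tag{14.112}$$
where $\kappa_r$ here denotes the real part of the wavenumber, not the spring constant. Then
$$\omega_r = 2\omega_0\sin\frac{k_r d}{2} = 2\omega_0\sin\frac{d}{2}\left(\kappa_r - i\Gamma_r\right) = 2\omega_0\left(\sin\frac{\kappa_r d}{2}\cosh\frac{\Gamma_r d}{2} - i\cos\frac{\kappa_r d}{2}\sinh\frac{\Gamma_r d}{2}\right) \tag{14.113}$$
To ensure that $\omega_r$ is real, the imaginary term must be zero, that is
$$\cos\frac{\kappa_r d}{2} = 0 \tag{14.114}$$
Therefore
$$\sin\frac{\kappa_r d}{2} = 1 \tag{14.115}$$
that is, $\kappa_r = \frac{\pi}{d}$, and the dispersion relation between $\omega$ and $k$ for $\omega > 2\omega_0$ becomes
$$\omega_r = 2\omega_0\cosh\frac{\Gamma_r d}{2} \tag{14.116}$$

which increases with $\Gamma_r$. Thus, when $\omega > \omega_c = 2\omega_0$ then the amplitude of the wave is of the form
$$q_r(x, t) = a_r e^{-\Gamma_r x}e^{i(\omega_r t - \kappa_r x)} \tag{14.117}$$

which corresponds to a spatially damped oscillatory wave with phase velocity

$$v^{phase} = \frac{\omega_r}{\kappa_r} \tag{14.118}$$
and damping factor $\Gamma_r$.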

There are many examples in physics where the wavenumber is complex, as exhibited by the discrete lattice
chain for $\frac{\lambda}{2} < d$. Other examples are electromagnetic waves in conductors or plasma (example 3.5), matter
waves tunnelling through a potential barrier, or standing waves on musical instruments which have a complex
wavenumber $k$ due to damping.

This simple toy model of the discrete linear lattice chain has illustrated that classical mechanics explains
many features of the many-body nearest-neighbor coupled linear oscillator system, including normal modes,
standing and travelling waves, cut-off frequency, dispersion, and complex wavenumber. These phenomena
feature prominently in applications of the quantal discrete coupled-oscillator system to solid-state physics.
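The behaviour above the cut-off frequency can be illustrated numerically by evaluating the dispersion relation with a complex wavenumber whose real part is fixed at $\pi/d$. The sketch below uses Python's standard `cmath` module; the function name and the values of $\omega_0$, $d$, and the damping factor are illustrative assumptions.

```python
import cmath
import math


def omega_complex_k(k_real, gamma, d, omega0):
    """Evaluate omega = 2*omega0*sin(k*d/2) for the complex wavenumber k = k_real - i*gamma."""
    k = complex(k_real, -gamma)
    return 2.0 * omega0 * cmath.sin(k * d / 2.0)


if __name__ == "__main__":
    omega0, d = 1.0, 1.0
    for gamma in (0.0, 0.5, 1.0, 2.0):
        omega = omega_complex_k(math.pi / d, gamma, d, omega0)
        expected = 2.0 * omega0 * math.cosh(gamma * d / 2.0)   # eq. 14.116
        print(f"Gamma = {gamma}: omega = {omega.real:.4f} "
              f"(imag part {omega.imag:.1e}), 2*omega0*cosh = {expected:.4f}")
```

With the real part of the wavenumber pinned at the zone boundary, the frequency is real to rounding error and grows as $\cosh(\Gamma d/2)$, so it exceeds $2\omega_0$ for any non-zero damping factor.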
14.11 Damped coupled linear oscillators


The discussion of coupled linear oscillators has neglected non-conservative damping forces which always exist
to some extent in physical systems. In general, dissipative forces are nonlinear, which greatly complicates
solving the equations of motion for such coupled oscillator systems. However, for some systems the dissipative
forces depend linearly on velocity which allows use of the Rayleigh dissipation function, described in chapter
10.4. The most general definition of the Rayleigh dissipation function, 10.4, was given to be

$$\mathcal{R} \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}\dot{q}_i\dot{q}_j \tag{14.119}$$

For this special case, it was shown in chapter 10 that the Lagrange equations can be written in terms of the
Rayleigh dissipation function as
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial\mathcal{R}}{\partial\dot{q}_j} = Q_j \tag{14.120}$$

where $Q_j$ are generalized forces acting on the system that are not absorbed into the potential $U$. Using
equations 14.43, 14.44, and 14.120 allows the equations of motion for damped coupled linear oscillators to
be written in a matrix form as

$$\{\mathbf{T}\}\ddot{\mathbf{q}} + \{\mathbf{C}\}\dot{\mathbf{q}} + \{\mathbf{V}\}\mathbf{q} = \{\mathbf{Q}\} \tag{14.121}$$
where the symmetric matrices $\{\mathbf{T}\}$, $\{\mathbf{C}\}$, and $\{\mathbf{V}\}$ are positive definite for positive definite systems. Rayleigh
pointed out that in the special case where the damping matrix $\{\mathbf{C}\}$ is a linear combination of the $\{\mathbf{T}\}$ and
$\{\mathbf{V}\}$ matrices, the matrix $\{\mathbf{C}\}$ is diagonalized by the same transformation to normal coordinates that
diagonalizes $\{\mathbf{T}\}$ and $\{\mathbf{V}\}$, leading to a separation of the damped system into normal
modes. As discussed in chapter 4, many systems in nature are linear for small amplitude oscillations, allowing
use of the Rayleigh dissipation function which provides an analytic solution. However, in general, except for
when $\{\mathbf{C}\}$ is small, this separation into normal modes is not possible for damped systems and the solutions
must be obtained numerically.

The following example illustrates approaches used to handle linearly-damped coupled-oscillator systems. 


14.11 Example: Two linearly-damped coupled linear oscillators 


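Consider the two coupled oscillator system shown
where the two carts have spring constants $k_1$, $k_2$ and
linear damping constants $c_1$, $c_2$. As discussed in example
14.3, the kinetic energy is given by

$$T = \frac{1}{2}m_1\dot{q}_1^2 + \frac{1}{2}m_2\dot{q}_2^2 \tag{a}$$

Figure: Two linearly-damped coupled linear oscillators.

and the potential energy is given by
$$U = \frac{1}{2}\left[k_1 q_1^2 + k_2\left(q_2 - q_1\right)^2\right] = \frac{1}{2}\left[\left(k_1 + k_2\right)q_1^2 - 2k_2 q_1 q_2 + k_2 q_2^2\right] \tag{b}$$

Similarly the Rayleigh dissipation function has the form
$$\mathcal{R} = \frac{1}{2}\left[c_1\dot{q}_1^2 + c_2\left(\dot{q}_2 - \dot{q}_1\right)^2\right] = \frac{1}{2}\left[\left(c_1 + c_2\right)\dot{q}_1^2 - 2c_2\dot{q}_1\dot{q}_2 + c_2\dot{q}_2^2\right] \tag{c}$$

Inserting a, b, and c into equation 14.120 gives the two equations of motion to be

$$\begin{aligned}
m_1\ddot{q}_1 + \left(c_1 + c_2\right)\dot{q}_1 - c_2\dot{q}_2 + \left(k_1 + k_2\right)q_1 - k_2 q_2 &= 0 \\
m_2\ddot{q}_2 - c_2\dot{q}_1 + c_2\dot{q}_2 - k_2 q_1 + k_2 q_2 &= 0
\end{aligned}$$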


When the drag is zero the solution of these two coupled equations can be separated into two independent 
normal modes of the system as described earlier. Usually it is not possible to separate the motion into 
decoupled normal modes except for certain cases where the dissipative forces can be described by Rayleigh’s 
dissipation function. 
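The damping terms in the two equations of motion are the derivatives $\partial\mathcal{R}/\partial\dot{q}_j$ of the Rayleigh dissipation function in (c). The sketch below compares the analytic damping terms with a central finite-difference derivative of $\mathcal{R}$; the function names, step size, and numerical values are illustrative assumptions.

```python
def rayleigh(c1, c2, v1, v2):
    """Rayleigh dissipation function (c) for the two damped carts; v1, v2 are the velocities."""
    return 0.5 * (c1 * v1**2 + c2 * (v2 - v1) ** 2)


def damping_terms(c1, c2, v1, v2):
    """Analytic dR/dv1 and dR/dv2, i.e. the damping terms in the equations of motion."""
    return ((c1 + c2) * v1 - c2 * v2, -c2 * v1 + c2 * v2)


def numeric_gradient(c1, c2, v1, v2, h=1e-3):
    """Central-difference estimate of (dR/dv1, dR/dv2)."""
    d1 = (rayleigh(c1, c2, v1 + h, v2) - rayleigh(c1, c2, v1 - h, v2)) / (2.0 * h)
    d2 = (rayleigh(c1, c2, v1, v2 + h) - rayleigh(c1, c2, v1, v2 - h)) / (2.0 * h)
    return (d1, d2)


if __name__ == "__main__":
    c1, c2, v1, v2 = 0.3, 0.7, 1.2, -0.4     # hypothetical values
    print("analytic:", damping_terms(c1, c2, v1, v2))
    print("numeric: ", numeric_gradient(c1, c2, v1, v2))
```

Because $\mathcal{R}$ is quadratic in the velocities, the central difference agrees with the analytic terms to rounding error, and setting $c_1 = c_2 = 0$ recovers the undamped equations.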
14.12 Collective synchronization of coupled oscillators


Collective synchronization of coupled oscillators is a multifaceted phenomenon where large ensembles of 
coupled oscillators, with comparable natural frequencies, self synchronize leading to coherent collective modes 
of motion. Biological examples include congregations of synchronously flashing fireflies, crickets that chirp in 
unison, an audience clapping at the end of a performance, networks of pacemaker cells in the heart, insulin- 
secreting cells in the pancreas, as well as neural networks in the brain and spinal cord that control rhythmic 
behaviors such as breathing, walking, and eating. Example 14.13 illustrates an application to nuclei. 

An ensemble of coupled oscillators will have a frequency distribution with a finite width. It is interesting to elucidate how an ensemble of coupled oscillators that has a finite-width frequency distribution can self-synchronize its motion to a unique common frequency, and how that synchronization is maintained over long time periods. The answers to these issues provide insight into the dynamics of coupled oscillators.

So far, the discussion of coupled oscillators has implicitly assumed $n$ identical undamped linear oscillators that have identical, infinitely-sharp, natural frequencies $\omega_i$. In nature, typical coupled oscillators can have a finite-width frequency distribution $g(\omega)$ about some average value, due to the natural variability of the oscillator parameters for biological systems, the manufacturing tolerances for mechanical oscillators, or the natural Lorentzian frequency distribution associated with the uncertainty principle that occurs even for atomic clocks where the oscillator frequencies are defined directly by the physical constants. Assume that the ensemble of coupled oscillators has a frequency distribution $g(\omega)$ about some average value.

Undamped linear oscillators have elliptical closed-path trajectories in phase space whereas dissipation 
leads to a spiral attractor unless the system is driven such as to preserve the total energy. As described 
in chapter 4.4 many systems in nature, especially biological systems, have closed limit cycles in phase 
space where the energy lost to dissipation is replenished by a driving mechanism. The simplest systems for 
understanding collective synchronization of coupled oscillators are those that involve closed limit cycles in 
phase space. 

N. Wiener first recognized the ubiquity of collective synchronization in the natural world, but his mathematical approach, based on Fourier integrals, was not suited to this problem. A more fruitful approach was pioneered in 1967 by an undergraduate student, A.T. Winfree [Win67], who recognized that the long-time behavior of a large ensemble of limit-cycle oscillators can be characterized in the simplest terms by considering only the phase of closed phase-space trajectories. He assumed that the instantaneous state of an ensemble of oscillators can be represented by points distributed around the circular phase-space diagram shown in figure 14.11. For uncoupled oscillators these points will be distributed randomly around the circle, whereas coupling of the oscillators will result in a spatial correlation of the points. That is, the dynamics of the phases can be visualized as a swarm of points running around the unit circle in the complex plane of the phase space diagram. The complex order parameter of this swarm is defined by the magnitude $r$ and phase $\psi$ of the centroid of the swarm


$$r e^{i\psi} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j} \tag{14.122}$$


The centroid of the ensemble of points on the phase diagram has a magnitude $r$, designating the offset of the centroid from the center of the circular phase diagram, and an angle $\psi$, which is the phase of this centroid. A uniform distribution of points around the unit circle will lead to a centroid $r = 0$. Correlated motion leads to a bunching of the points around some phase value, leading to a non-zero centroid $r$ and angle $\psi$. If the swarm acts like a fully-coupled single oscillator then $r = 1$ with an appropriate phase $\psi$.
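The definition of the order parameter in equation 14.122 can be evaluated directly for any list of phases. The short sketch below is illustrative only; the function name `order_parameter` and the two sample phase lists are assumptions, not taken from the text. It returns the magnitude $r$ and phase $\psi$ of the centroid for evenly spread phases and for fully locked phases.

```python
import cmath
import math


def order_parameter(phases):
    """Return (r, psi) with r*exp(i*psi) = (1/N) * sum_j exp(i*theta_j)."""
    centroid = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(centroid), cmath.phase(centroid)


# Eight phases spread evenly around the circle: the centroid sits at the center.
uniform = [2 * math.pi * j / 8 for j in range(8)]
# Eight oscillators sharing one phase: the swarm acts like a single oscillator.
locked = [0.3] * 8

print(order_parameter(uniform))
print(order_parameter(locked))
```

For the evenly spread phases $r$ vanishes to rounding error, so the returned $\psi$ carries no meaning in that case.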

Figure 14.11: Order parameter for weakly-coupled oscillators.

Figure 14.12: Kuramoto model of collective synchronization of coupled oscillators. The left and center plots show the time and coupling strength dependence of the order parameter $r$. The right plot shows the frequency dependence including coupling (solid line) and without coupling (dashed line).

The Kuramoto model [Kur75, Str00] incorporates Winfree’s intuition by mapping the limit cycles onto a simple circular phase diagram and describing the long-term dynamics of coupled oscillators in terms of the relative phases for a mean-field system. That is, the angular velocity of the phase $\theta_i$ for the $i^{th}$ oscillator is

$$\dot{\theta}_i = \omega_i + \sum_{j=1}^{N} \Gamma_{ij}(\theta_j - \theta_i) \tag{14.123}$$


where $i = 1, 2, \ldots, N$. Kuramoto recognized that mean-field coupling was the most tractable system to solve, that is, a system where the coupling applies equally to all the oscillators. Moreover, he assumed an equally-weighted, pure sinusoidal coupling for the coupling term $\Gamma_{ij}(\theta_j - \theta_i)$ between the coupled oscillators. That is, he assumed

$$\Gamma_{ij}(\theta_j - \theta_i) = \frac{K}{N} \sin(\theta_j - \theta_i) \tag{14.124}$$

where $K > 0$ is the coupling strength, and the factor $\frac{1}{N}$ ensures that the model is well behaved as $N \to \infty$. Kuramoto assumed that the frequency distribution $g(\omega)$ was unimodal and symmetric about the mean frequency $\Omega$, that is, $g(\Omega + \omega) = g(\Omega - \omega)$.

This problem can be simplified by exploiting the rotational symmetry and transforming to a frame of reference that is rotating at an angular frequency $\Omega$. That is, use the transformation $\theta_i = \phi_i - \Omega t$, where $\phi_i$ is the phase in the original frame and $\theta_i$ is measured in the rotating frame. In this frame $g(\omega)$ is unimodal and symmetric about $\omega = 0$. The phase velocity in this rotating frame is

$$\dot{\theta}_i = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \tag{14.125}$$


Kuramoto observed that the phase-space distribution can be expressed in terms of the order parameters $r$, $\psi$, in that equation 14.122 can be multiplied on both sides by $e^{-i\theta_i}$ to give

$$r e^{i(\psi - \theta_i)} = \frac{1}{N} \sum_{j=1}^{N} e^{i(\theta_j - \theta_i)} \tag{14.126}$$

Equating the imaginary parts yields

$$r \sin(\psi - \theta_i) = \frac{1}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \tag{14.127}$$

This allows equation 14.125 to be written as

$$\dot{\theta}_i = \omega_i + K r \sin(\psi - \theta_i) \tag{14.128}$$


for $i = 1, 2, \ldots, N$. Equation 14.128 reflects the mean-field aspect of the model in that each oscillator $\theta_i$ is attracted to the phase $\psi$ of the mean field rather than to the phase of another individual oscillator.

Simulations showed that the evolution of the order parameter with coupling strength $K$ is as illustrated in figure 14.12. These simulations show the following. (1) For all $K$ below a certain threshold $K_c$, the order parameter decays to an incoherent jitter, as expected for a random scatter of $N$ points. (2) When $K > K_c$ this incoherent state becomes unstable and the order parameter $r$ grows exponentially, reflecting the nucleation of small clusters of oscillators that are mutually synchronized. (3) The population of individual oscillators splits into two groups. The oscillators near the center of the distribution lock together in phase at the mean angular frequency $\Omega$ and co-rotate with average phase $\psi(t)$, whereas those with frequencies lying further from the center continue to rotate independently at their natural frequencies and drift relative to the coherent cluster frequency $\Omega$. As a consequence this mixed state is only partially synchronized, as illustrated on the right side of figure 14.12. The synchronized fraction has a $\delta$-function behavior for the frequency distribution, which grows in intensity with further increase in $K$. The unsynchronized component has nearly the original frequency distribution $g(\omega)$ except that it is depleted in the region of the locked frequency due to strength absorbed by the $\delta$-function component.
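The behavior described above can be explored with a small numerical experiment. The following is a rough sketch under stated assumptions, not the simulation used for figure 14.12: the Gaussian frequency distribution of unit width, the ensemble size, the forward-Euler time step, the random seed, and all function names are illustrative choices. It first confirms that the pairwise sum of equation 14.125 and the mean-field form of equation 14.128 give the same phase velocities, then integrates the mean-field form and reports the time-averaged order parameter for a weak and a strong coupling.

```python
import math
import random


def order_parameter(phases):
    """Magnitude r and phase psi of the centroid (equation 14.122)."""
    n = len(phases)
    re = sum(math.cos(t) for t in phases) / n
    im = sum(math.sin(t) for t in phases) / n
    return math.hypot(re, im), math.atan2(im, re)


def velocities_pairwise(phases, omegas, K):
    """Equation 14.125: explicit sum over all oscillator pairs."""
    n = len(phases)
    return [w + K / n * sum(math.sin(tj - ti) for tj in phases)
            for ti, w in zip(phases, omegas)]


def velocities_mean_field(phases, omegas, K):
    """Equation 14.128: each oscillator couples only to r and psi."""
    r, psi = order_parameter(phases)
    return [w + K * r * math.sin(psi - ti) for ti, w in zip(phases, omegas)]


def simulate(K, n=200, dt=0.05, steps=1000, seed=1):
    """Return r averaged over the second half of a forward-Euler run."""
    rng = random.Random(seed)
    omegas = [rng.gauss(0.0, 1.0) for _ in range(n)]          # assumed g(w)
    phases = [rng.uniform(0.0, 2 * math.pi) for _ in range(n)]
    r_history = []
    for _ in range(steps):
        rates = velocities_mean_field(phases, omegas, K)
        phases = [t + dt * v for t, v in zip(phases, rates)]
        r_history.append(order_parameter(phases)[0])
    tail = r_history[steps // 2:]
    return sum(tail) / len(tail)


print("weak coupling   K = 0.5:", simulate(0.5))
print("strong coupling K = 4.0:", simulate(4.0))
```

The mean-field form needs only of order $N$ operations per time step, whereas the pairwise sum needs of order $N^2$, which is why the sketch integrates equation 14.128 rather than equation 14.125.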

Kuramoto’s toy model nicely illustrates the essential features of the evolution of collective synchronization
with coupling strength. It has been applied to the study of neuronal synchronization in the brain [Cum07]. The
model illustrates that the collective synchronization of coupled oscillators leads to a component that has a 
single frequency for correlated motion which can be much narrower than the inherent frequency distribution 
of the ensemble of coupled oscillators. 


14.12 Example: Collective motion in nuclei 


The nucleus is an unusual quantal system that involves the coupled motion of the many nucleons. It 
exhibits features characteristic of the many-body classical coupled oscillator with coupling between all the 
valence nucleons. Nuclear structure can be described by a shell model of individual nucleons bound in weakly 
interacting orbits in a central average mean field that is produced by the summed attraction of all the nucleons 
in the nucleus. However, nuclei also exhibit features characteristic of collective rotation and vibration of a 
quantal fluid. For example, beautiful rotational bands up to spin over $60\hbar$ are observed in heavy nuclei. These
rotational bands are similar to those observed in the rotational structure of diatomic molecules. Actinide 
nuclei also can fission into two large fragments which is another manifestation of collective motion. 

Figure 14.13 shows the case of collective bands in $^{238}$U populated by Coulomb exciting a 1355 MeV
$^{238}$U beam by a $^{208}$Pb target. This case exhibits both quadrupole and octupole collective rotational bands up
to spin 40. The inset shows the moment of inertia plotted versus the angular rotational energy $\hbar\omega$. The
electromagnetic E2 transition rates correspond to collective motion of $\approx 32$ nucleons. Collective motion of
many nucleons is the antithesis of shell model motion where the nucleons are assumed to follow independent 
orbiting motion like planets around the Sun. Although the nucleus is a quantal system, this strange dichotomy 
can be understood in terms of a classical rotating system having weak linear coupling between each of many 
similar harmonic oscillators; which in this case, are nucleons bound in a spheroidally-deformed shell-model 
potential well. 

The essential general feature of weakly-coupled identical oscillators is illustrated by the solutions of the 
three linearly-coupled identical oscillators where the most symmetric state is displaced in frequency from the 
remaining states. For n identical oscillators, one state is displaced significantly in energy from the remaining 
n — 1 degenerate states. This most symmetric state is pushed downwards in energy if the residual coupling 
force is attractive, and it is pushed upwards if the coupling force is repulsive. This symmetric state corresponds 
to the coherent oscillation of all the coupled oscillators, and carries all of the strength for the corresponding 
dominant multipole for the coupling force. In the nucleus this state corresponds to coherent shape oscillations 
of many nucleons. 

The weak residual electric quadrupole and octupole nucleon-nucleon correlations in the nucleon-nucleon 
interactions generate collective quadrupole and octupole motion in nuclei. The collective synchronization 
of such coherent quadrupole and octupole excitation leads to collective bands of states, that correspond to 
synchronized in-phase motion of the protons and neutrons in the valence oscillator shell. These modes 
correspond to rotations and vibrations about the center of mass. The attractive residual nucleon-nucleon 
interaction couples the many individual particle excitations in a given shell producing one coherent state 
that is pushed downwards in energy far from the remaining n — 1 degenerate states. This coherent state 
involves correlated motion of the nucleons that corresponds to a macroscopic oscillation of a charged fluid. 
For non-closed shell nuclei like $^{238}$U, the dominant quadrupole multipole in the residual nucleon-nucleon
interaction leads to the ground state being a coherent state corresponding to $\sim 16$ protons plus $\sim 20$ neutrons
oscillating in phase. The collective motion of the charged protons leads to electromagnetic E2 radiation
with a transition decay amplitude about 16 times larger than for a single proton. This corresponds to the
radiative decay probability being enhanced by a factor of $\approx 256$ relative to radiation by a single proton. This
collective state corresponds to a macroscopic quadrupole deformation at low excitation energies that exhibits 
both collective rotational and vibrational degrees of freedom as shown in the figure. This coherent state is 


Figure 14.13: Collective rotational bands in the nucleus $^{238}$U excited by Coulomb excitation [Sim98]. Labelled bands: ground-state band and octupole band.


analogous to the correlated flow of individual water molecules in a tidal wave. The weaker octupole term in 
the residual interaction leads to an octupole [pear-shaped] coupled oscillator coherent state lying slightly above 
the quadrupole coherent state. In contrast to the rotational motion of strongly-deformed quadrupole-deformed 
nuclei, the octupole deformation exhibits more vibrational-like properties than rotational motion of a charged 
tidal wave. The observed large increase in moment of inertia at higher rotational frequencies, shown in the inset, is due to the Coriolis force aligning the individual valence nucleons along the rotational axis. Thus, although the nucleus $^{238}$U is the epitome of a complicated many-body quantal system, it is apparent that basic classical mechanics of coupled oscillators, and rotation, underlie the physics phenomena exhibited by synchronized collective motion in the nuclear many-body system.

The close correspondence between classical mechanics predictions and the excitation phenomena observed for the $^{238}$U nucleus is surprising for a system that is the epitome of a many-body quantal fluid.
The following list identifies other manifestations of classical mechanics discussed in this book, that were 
exploited for study of such correlated motion of many-body nuclear systems. 


1. Coincident detection of the excited nuclei recoiling in vacuum was used to identify the exact scattering angles, plus recoil velocities, of the scattered nuclei. This specifies the hyperbolic Rutherford trajectory for each scattered nucleus, the nuclear masses, and their recoil velocities. The deexcitation $\gamma$-rays, emitted in flight by each recoiling nucleus, were detected in coincidence with the scattered nuclei. Knowledge of the recoil velocities and scattering angles enabled correction for the Doppler shift in energy of each detected coincident $\gamma$-ray to enhance the experimental energy resolution achieved by the $\gamma$-ray detectors.

2. The transition energies and angular distribution of the deexcitation $\gamma$-rays determined the energies, spins, and parities of the excited states in $^{238}$U.

3. The measured yields of the coincident deexcitation $\gamma$-rays determined the excitation cross section as a function of the nuclear scattering angle.

4. A full quantal calculation for this system is beyond the capabilities of modern computers since the experiment involves excitation of $\sim 100$ excited levels, coupled by $\sim 1000$ electromagnetic matrix elements, and the scattering involves inclusion of thousands of partial waves due to the long range of the Coulomb potential for the heavy mass of the scattered nuclei. Therefore a semi-classical approximation is used for the quantal calculation of the electromagnetic excitation cross sections as a function of time as the scattered nuclei traverse Rutherford’s hyperbolic Coulomb scattering trajectory for each scattered nucleus.

5. The measured cross sections for the deexcitation $\gamma$-rays are compared with the predicted cross sections to determine the $\sim 1000$ electromagnetic matrix elements connecting the states in $^{238}$U.

6. The electromagnetic matrix elements have been measured in the laboratory frame of reference. Much more insight into the collective motion in $^{238}$U is obtained by transforming the electromagnetic matrix elements into the body-fixed frame of reference for this rotating deformed body. Rotational invariants, described in chapter 13.16, are used to derive the electromagnetic properties in the rotating body-fixed frame of reference, which unambiguously determines the electromagnetic shape for each excited nuclear state observed in $^{238}$U.

7. Hamiltonian mechanics, based on the Routhian $R_{noncyclic}$, is used to make theoretical model calculations of the nuclear structure of $^{238}$U in the rotating body-fixed frame for comparison with the experimental data derived from this experiment.


This experiment illustrates that classical mechanics plays a key role in all aspects of the study of the 
nuclear structure of the many-body nuclear quantal system. 


14.13 Summary 


This chapter has focussed on many-body coupled linear oscillator systems, which are a ubiquitous feature in nature. The main conclusions are summarized below.


Normal modes: It was shown that coupled linear oscillators exhibit normal modes and normal coordinates that correspond to independent modes of oscillation with characteristic eigenfrequencies $\omega_i$.


General analytic theory for coupled linear oscillators: Lagrangian mechanics was used to derive the general analytic procedure for solution of the many-body coupled oscillator problem, which reduces to the conventional eigenvalue problem. A summary of the procedure for solving coupled oscillator problems is as follows.

1) Choose generalized coordinates $q_j$ and evaluate $T$ and $U$.

$$T = \frac{1}{2} \sum_{j,k}^{n} T_{jk} \dot{q}_j \dot{q}_k \tag{14.41}$$

and

$$U = \frac{1}{2} \sum_{j,k}^{n} V_{jk} q_j q_k \tag{14.42}$$

where the components of the $T$ and $V$ tensors are

$$T_{jk} \equiv \sum_{\alpha}^{N} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j} \frac{\partial x_{\alpha,i}}{\partial q_k} \tag{14.43}$$

and

$$V_{jk} \equiv \left( \frac{\partial^2 U}{\partial q_j \partial q_k} \right)_0 \tag{14.44}$$

2) Determine the eigenvalues $\omega_r$ using the secular determinant.

$$\begin{vmatrix} V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} \\ V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} \\ V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} \end{vmatrix} = 0 \tag{14.52}$$

3) The eigenvectors are obtained by inserting the eigenvalues $\omega_r$ into

$$\sum_{j} \left( V_{jk} - \omega_r^2 T_{jk} \right) a_{jr} = 0 \tag{14.51}$$

4) From the initial conditions determine the complex scale factors $\beta_r$, where

$$\eta_r(t) = \beta_r e^{i\omega_r t} \tag{14.58}$$

5) Determine the normal coordinates, where each $\eta_r$ is a normal mode. The normal coordinates can be expressed as

$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1} \mathbf{q} \tag{14.61}$$
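Steps 2 and 3 of the procedure above can be sketched for the smallest non-trivial case, two degrees of freedom, where the secular determinant reduces to a quadratic in $\omega^2$. The function name `normal_modes_2dof` and the numerical values of $M$, $\kappa$ and $\kappa_{12}$ are assumptions chosen for illustration (two equal masses, each tied to a wall by a spring $\kappa$ and to each other by a spring $\kappa_{12}$); this is one possible way to carry out the steps, not the text's own implementation.

```python
import math


def normal_modes_2dof(T, V):
    """Solve the secular equation det(V - w^2 T) = 0 for symmetric 2x2 T and V.

    Returns [(omega, (a1, a2)), ...] sorted by increasing omega, with each
    eigenvector normalized to unit length.
    """
    (t11, t12), (_, t22) = T
    (v11, v12), (_, v22) = V
    # Expanding the determinant gives A*lam^2 + B*lam + C = 0 with lam = w^2.
    A = t11 * t22 - t12 * t12
    B = -(v11 * t22 + v22 * t11 - 2.0 * v12 * t12)
    C = v11 * v22 - v12 * v12
    root = math.sqrt(B * B - 4.0 * A * C)
    modes = []
    for lam in sorted([(-B - root) / (2.0 * A), (-B + root) / (2.0 * A)]):
        # The first row of (V - lam*T) a = 0 fixes the ratio a2/a1.
        r1, r2 = v11 - lam * t11, v12 - lam * t12
        if abs(r2) > 1e-12:
            a1, a2 = 1.0, -r1 / r2
        elif abs(r1) > 1e-12:
            a1, a2 = 0.0, 1.0
        else:
            a1, a2 = 1.0, 0.0
        norm = math.hypot(a1, a2)
        modes.append((math.sqrt(lam), (a1 / norm, a2 / norm)))
    return modes


# Illustrative values (assumptions): M = 1, kappa = 1, kappa12 = 0.5.
M, kappa, kappa12 = 1.0, 1.0, 0.5
T = [[M, 0.0], [0.0, M]]
V = [[kappa + kappa12, -kappa12], [-kappa12, kappa + kappa12]]
for omega, a in normal_modes_2dof(T, V):
    print(omega, a)
```

For these values the lower mode is the symmetric one, with both masses moving in phase, and the upper mode is the antisymmetric one. Systems with more degrees of freedom would normally be handed to a numerical generalized-eigenvalue routine rather than expanded by hand.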


Few-body coupled oscillator systems The general analytic theory was used to determine the solutions 
for parallel and series couplings of two and three linear oscillators. The phenomena observed include degen- 
erate and non-degenerate eigenvalues and spurious center-of-mass oscillatory modes. There are two broad 
classifications for three or more coupled oscillators, that is, either complete coupling of all oscillators, or 
coupling of the nearest-neighbor oscillators. It is observed that the eigenvalue corresponding to the most 
coherent motion of the coupled oscillators corresponds to the most collective motion and its eigenvalue is dis- 
placed the most in energy from the remaining eigenvalues. For some systems this coherent collective mode 
corresponded to a center-of-mass motion with no internal excitation of the other modes, while the other 
eigenvalues corresponded to modes with internal excitation of the oscillators such that the center of mass 
is stationary. The above procedure has been applied to two classifications of coupling: complete coupling of
many oscillators, and nearest neighbor coupling. Both degenerate and spurious center-of-mass modes were 
observed. Strong collective shape degrees of freedom in nuclei are examples of complete coupling due to the 
weak residual interactions between nucleons in the nucleus. It was seen that, for many coupled oscillators, 
one coherent state separates from the other states and this coherent state carries the bulk of the collective 
strength. 


Discrete lattice chain Transverse and longitudinal modes of motion on the discrete lattice chain were dis- 
cussed because of the important role it plays in nature, such as in crystalline lattice structures. Both normal 
modes and travelling waves were discussed including the phenomena of dispersion and cut-off frequencies. 
Molecules and the crystalline lattice chains are examples where nearest neighbor coupling is manifest. It 
was shown that, for the $n$-oscillator discrete lattice chain, there are only $n$ independent longitudinal modes
plus $n$ modes for the two transverse polarizations, and that the angular frequency $\omega_r \leq 2\omega_0$, that is, a cut-off
frequency exists.


Damped coupled linear oscillators It was shown that linearly-damped coupled oscillator systems can 
be solved analytically using the concept of the Rayleigh dissipation function. 


Collective synchronization of coupled oscillators: The Kuramoto schematic phase model was used to illustrate how weak residual forces can cause collective synchronization of the motion of many coupled oscillators. This is applicable to biological systems as well as mechanical systems.

Workshop exercises


1. Consider two masses (each of mass $M$) connected by a spring to each other and by springs to fixed positions. Motion is only allowed along one dimension. (This is exactly the same system that is discussed in chapter 14.2 on coupled oscillations.) Let each of the two oscillator springs have a force constant $\kappa$ and let the force constant of the coupling spring be $\kappa_{12}$. Let $x_1$ and $x_2$ be the coordinates as described in the textbook.

(a) Draw a picture of the two masses displaced by a small amount. Using the picture, try to make sense of the equations of motion as given in the text:

$$M\ddot{x}_1 + (\kappa + \kappa_{12}) x_1 - \kappa_{12} x_2 = 0, \qquad M\ddot{x}_2 + (\kappa + \kappa_{12}) x_2 - \kappa_{12} x_1 = 0$$

(b) Each of the trial solutions is written in the form $B e^{i\omega t}$. Why are the trial solutions written this way? Are there any other ways to write the trial solution?

(c) For a nontrivial solution to exist for the pair of simultaneous equations resulting from the substitution of the trial solution, the determinant of the coefficients of $B_1$ and $B_2$ must vanish. Why must this be the case? Is a similar statement true when considering three masses? What about $n$ masses?

(d) Suppose you had the actual two-mass system sitting in front of you. How could you create antisymmetric motion? How could you create symmetric motion? Can you describe each of these motions using a set of suitable initial conditions?


2. Two particles, each with mass $m$, move in one dimension in a region near a local minimum of the potential energy where the potential energy is approximately given by

$$U = \frac{1}{2} k \left( 7x_1^2 + 4x_2^2 + 4x_1 x_2 \right)$$

where $k$ is a constant.

(a) Determine the frequencies of oscillation.

(b) Determine the normal coordinates.

3. What is degeneracy? When does it arise?


4. The Lagrangian of three coupled oscillators is given by:

$$L = \sum_{n=1}^{3} \left[ \frac{m}{2} \dot{x}_n^2 - \frac{k}{2} x_n^2 \right] + k'(x_1 x_2 + x_2 x_3).$$

Find $x_2(t)$ for the following initial conditions (at $t = 0$):

$$(x_1, x_2, x_3) = (x_0, 0, 0), \qquad (\dot{x}_1, \dot{x}_2, \dot{x}_3) = (0, 0, v_0).$$


5. A mechanical analog of the benzene molecule comprises a discrete lattice chain of 6 point masses M connected 
in a plane hexagonal ring by 6 identical springs each with spring constant k and length d. 
a) List the wave numbers of the allowed undamped longitudinal standing waves. 
b) Calculate the phase velocity and group velocity for longitudinal travelling waves on the ring. 
c) Determine the time dependence of a longitudinal standing wave for an angular frequency $\omega = 2\omega_{cutoff}$, that is, twice the cut-off frequency.


6. Consider a one dimensional, two-mass, three-spring system governed by the matrix A, 


4 -2 
(47) 
such that $A\mathbf{x} = \omega^2 \mathbf{x}$,

(a) Determine the eigenfrequencies and normal coordinates.
(b) Choose a set of initial conditions such that the system oscillates at its highest eigenfrequency.
(c) Determine the solutions $x_1(t)$ and $x_2(t)$.

Problems


1. Four identical masses $m$ are connected by four identical springs, spring constant $\kappa$, and constrained to move
on a frictionless circle of radius b as shown on the left in the figure. 


a) How many normal modes of small oscillation are there? 
b) What are the eigenfrequencies of the small oscillations? 


c) Describe the motion of the four masses for each eigenfrequency. 


2. Consider the two identical coupled oscillators given on the right in the figure, assuming $\kappa_1 = \kappa_2 = \kappa$. Let both oscillators be linearly damped with a damping constant $\beta$. A force $F = F_0 \cos(\omega t)$ is applied to mass $m_1$. Write down the pair of coupled differential equations that describe the motion. Obtain a solution by expressing the differential equations in terms of the normal coordinates. Show that the normal coordinates $\eta_1$ and $\eta_2$ exhibit resonance peaks at the characteristic frequencies $\omega_1$ and $\omega_2$ respectively.


3. As shown on the left below, the mass $M$ moves horizontally along a frictionless rail. A pendulum is hung from $M$ with a weightless rod of length $b$ with a mass $m$ at its end.

a) Prove that the eigenfrequencies are

$$\omega_1 = 0 \qquad \omega_2 = \sqrt{\frac{g}{Mb}(M + m)}$$

b) Describe the normal modes.

Chapter 15


Advanced Hamiltonian mechanics 


15.1 Introduction 


This study of classical mechanics has involved climbing a vast mountain of knowledge, while the pathway to 
the top has led us to elegant and beautiful theories that underlie much of modern physics. Being so close to 
the summit provides the opportunity to take a few extra steps in order to provide a glimpse of applications 
to physics at the summit. These are described in chapters 15 — 18. 

Hamilton’s development of Hamiltonian mechanics in 1834 is the crowning achievement for applying vari- 
ational principles to classical mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses 
the conjugate coordinates q, p, plus time t, which is a considerable advantage in most branches of physics 
and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader 
arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the 
motion for complicated systems. In addition, Hamiltonian dynamics provides a means of determining the 
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental 
underlying physics in applications to fields such as quantum or statistical physics. As a consequence, Hamil- 
tonian mechanics has become the preeminent variational approach used in modern physics. This chapter 
introduces the following four techniques in Hamiltonian mechanics: (1) the elegant Poisson bracket repre- 
sentation of Hamiltonian mechanics, which played a pivotal role in the development of quantum theory; (2) 
the powerful Hamilton-Jacobi theory coupled with Jacobi’s development of canonical transformation theory; 
(3) action-angle variable theory; and (4) canonical perturbation theory. 

Prior to further development of the theory of Hamiltonian mechanics, it is useful to summarize the major
formulae relevant to Hamiltonian mechanics that have been presented in chapters 7, 8, and 9.

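Action functional $S$:

As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton’s action functional

$$S(\mathbf{q}, \mathbf{p}, t) = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt \tag{15.1}$$

Hamilton’s Principle of least action states that

$$\delta S(\mathbf{q}, \mathbf{p}, t) = \delta \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt = 0 \tag{15.2}$$

Generalized momentum $p$:
In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian $L$ to be

$$p_j \equiv \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} \tag{15.3}$$

Chapter 9.2 defined the generalized momentum in terms of the action functional $S$ to be

$$p_j = \frac{\partial S(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} \tag{15.4}$$

Generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$:
Jacobi’s Generalized Energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was defined in equation 7.37 as

$$h(\mathbf{q}, \dot{\mathbf{q}}, t) \equiv \sum_{j} \dot{q}_j \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{15.5}$$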

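Hamiltonian function:
The Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ was defined in terms of the generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ plus the generalized momentum. That is

$$H(\mathbf{q}, \mathbf{p}, t) \equiv h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_{j} \dot{q}_j \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} - L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{15.6}$$

where $\mathbf{p}$, $\mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} = (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$. Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian functions. Note that whereas the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their conjugate momenta $\mathbf{p}$. For scleronomic systems with the standard Lagrangian, equations 7.44 and 7.29 show that the Hamiltonian equals the total mechanical energy, that is, $H = T + U$.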
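As a brief worked illustration of equations 15.3 and 15.6, consider a one-dimensional harmonic oscillator of mass $m$ and spring constant $k$. This example is an assumption chosen for simplicity and is not taken from the text; it shows the Legendre transformation returning $H = T + U$ for a scleronomic system with a standard Lagrangian.

```latex
\begin{align*}
L(q,\dot q) &= \tfrac{1}{2} m \dot q^{2} - \tfrac{1}{2} k q^{2} \\
p &= \frac{\partial L}{\partial \dot q} = m \dot q
   \quad\Longrightarrow\quad \dot q = \frac{p}{m} \\
H(q,p) &= p\,\dot q - L
        = \frac{p^{2}}{m} - \left( \frac{p^{2}}{2m} - \tfrac{1}{2} k q^{2} \right)
        = \frac{p^{2}}{2m} + \tfrac{1}{2} k q^{2} = T + U
\end{align*}
```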

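Generalized energy theorem:

The equations of motion lead to the generalized energy theorem, which states that the time dependence of the Hamiltonian is related to the time dependence of the Lagrangian.

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_{j} \dot{q}_j \left[ Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \right] - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{15.7}$$

Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.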

Hamilton’s equations of motion:

Chapter 8.3 showed that a Legendre transform plus the Lagrange-Euler equations led to Hamilton’s equations of motion. Hamilton derived these equations of motion directly from the action functional, as shown in chapter 9.2.

$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{15.8}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j}(\mathbf{q}, \mathbf{p}, t) + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{15.9}$$

$$\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial t} = -\frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{15.10}$$

Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_k$, $q_k$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2s$ first-order differential equations for $p_k$, $q_k$ for the $s$ degrees of freedom. Lagrange’s equations give $s$ second-order differential equations for the variables $q_k$, $\dot{q}_k$.
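Because Hamilton’s equations are first order, they can be handed directly to a standard integrator. The sketch below is a hedged illustration for one degree of freedom with no constraint or non-potential forces, so the bracketed terms in equation 15.9 vanish; the harmonic-oscillator Hamiltonian with $m = k = 1$, the finite-difference derivatives, and the function names are assumptions chosen for simplicity.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """H = T + U for a one-dimensional harmonic oscillator (illustrative choice)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def hamilton_rhs(q, p, h=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq) using central differences."""
    dH_dq = (hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2.0 * h)
    dH_dp = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2.0 * h)
    return dH_dp, -dH_dq


def rk4_step(q, p, dt):
    """Advance (q, p) by one classical fourth-order Runge-Kutta step."""
    k1q, k1p = hamilton_rhs(q, p)
    k2q, k2p = hamilton_rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = hamilton_rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = hamilton_rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt / 6.0 * (k1q + 2.0 * k2q + 2.0 * k3q + k4q)
    p_new = p + dt / 6.0 * (k1p + 2.0 * k2p + 2.0 * k3p + k4p)
    return q_new, p_new


def evolve(q, p, t_end, steps):
    """Integrate Hamilton's equations from t = 0 to t = t_end."""
    dt = t_end / steps
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
    return q, p


# One full period for m = k = 1 is 2*pi; the state should return to its start.
q1, p1 = evolve(1.0, 0.0, 2.0 * math.pi, 1000)
print(q1, p1, hamiltonian(q1, p1))
```

Since this Hamiltonian has no explicit time dependence, $H$ evaluated along the computed trajectory should stay at its initial value to within the integration error, in line with the generalized energy theorem.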

Hamilton-Jacobi equation:

Hamilton used Hamilton’s Principle to derive the Hamilton-Jacobi equation.

$$H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$

The solution of Hamilton’s equations is trivial if the Hamiltonian is a constant of motion, or when a set of generalized coordinates can be identified for which all the coordinates $q_i$ are constant, or are cyclic (also called ignorable coordinates). Jacobi developed the mathematical framework of canonical transformations required to exploit the Hamilton-Jacobi equation.

15.2 Poisson bracket representation of Hamiltonian mechanics


15.2.1 Poisson Brackets 


Poisson brackets were developed by Poisson, who was a student of Lagrange. Hamilton’s canonical equations 
of motion describe the time evolution of the canonical variables (q, p) in phase space. Jacobi showed that the 
framework of Hamiltonian mechanics can be restated in terms of the elegant and powerful Poisson bracket 
formalism. The Poisson bracket representation of Hamiltonian mechanics provides a direct link between 
classical mechanics and quantum mechanics. 

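The Poisson bracket of any two continuous functions of generalized coordinates $F(\mathbf{p}, \mathbf{q})$ and $G(\mathbf{p}, \mathbf{q})$ is defined to be

$$[F, G]_{qp} \equiv \sum_{i} \left( \frac{\partial F}{\partial q_i} \frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q_i} \right) \tag{15.12}$$

Note that the above definition of the Poisson bracket leads to the following identity, antisymmetry, linearity, Leibniz rules, and Jacobi Identity.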


$$[F, F] = 0 \tag{15.13}$$

$$[F, G] = -[G, F] \tag{15.14}$$

$$[G, F + Y] = [G, F] + [G, Y] \tag{15.15}$$

$$[G, FY] = [G, F]\, Y + F\, [G, Y] \tag{15.16}$$

$$0 = [F, [G, Y]] + [G, [Y, F]] + [Y, [F, G]] \tag{15.17}$$

where $F$, $G$, and $Y$ are functions of the canonical variables plus time. Jacobi’s identity (15.17) states that the sum of the cyclic permutations of the double Poisson brackets of three functions is zero. Jacobi’s identity plays a useful role in Hamiltonian mechanics, as will be shown.


15.2.2 Fundamental Poisson brackets: 


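The Poisson brackets of the canonical variables themselves are called the fundamental Poisson brackets. They are

$$[q_k,q_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial q_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial q_l}{\partial q_i} \right) = \sum_i \left( \delta_{ki}\cdot 0 - 0\cdot\delta_{li} \right) = 0 \tag{15.18}$$

$$[p_k,p_l]_{qp} = \sum_i \left( \frac{\partial p_k}{\partial q_i}\frac{\partial p_l}{\partial p_i} - \frac{\partial p_k}{\partial p_i}\frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( 0\cdot\delta_{li} - \delta_{ki}\cdot 0 \right) = 0 \tag{15.19}$$

$$[q_k,p_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial p_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( \delta_{ki}\delta_{li} - 0\cdot 0 \right) = \delta_{kl} \tag{15.20}$$

In summary, the fundamental Poisson brackets equal

$$[q_k,q_l]_{qp} = 0 \tag{15.21}$$
$$[p_k,p_l]_{qp} = 0 \tag{15.22}$$
$$[q_k,p_l]_{qp} = -[p_l,q_k]_{qp} = \delta_{kl} \tag{15.23}$$

Note that the Poisson bracket is antisymmetric under interchange of $p$ and $q$. It is interesting that the only non-zero fundamental Poisson bracket is for conjugate variables where $k = l$, that is

$$[q_k,p_k]_{qp} = 1 \tag{15.24}$$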




15.2.3 Poisson bracket invariance to canonical transformations 


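The Poisson brackets are invariant under a canonical transformation from one set of canonical variables $(q_k, p_k)$ to a new set of canonical variables $(Q_k, P_k)$ where $Q_k \rightarrow Q_k(q,p)$ and $P_k \rightarrow P_k(q,p)$. This is shown by transforming equation 15.12 to the new variables by the following derivation

$$[F,G]_{qp} = \sum_j \left( \frac{\partial F}{\partial q_j}\frac{\partial G}{\partial p_j} - \frac{\partial F}{\partial p_j}\frac{\partial G}{\partial q_j} \right) \tag{15.25}$$

$$= \sum_{jk} \left[ \frac{\partial F}{\partial q_j}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial p_j} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial p_j} \right) - \frac{\partial F}{\partial p_j}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial q_j} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial q_j} \right) \right] \tag{15.26}$$

The terms can be rearranged to give

$$[F,G]_{qp} = \sum_k \left( \frac{\partial G}{\partial Q_k}[F,Q_k]_{qp} + \frac{\partial G}{\partial P_k}[F,P_k]_{qp} \right) \tag{15.27}$$

Let $F = Q_k$ and replace $G$ by $F$, and use the fact that the fundamental Poisson brackets $[Q_k,Q_j]_{qp} = 0$ and $[Q_k,P_j]_{qp} = \delta_{jk}$, then equation 15.27 reduces to

$$[Q_k,F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[Q_k,Q_j]_{qp} + \frac{\partial F}{\partial P_j}[Q_k,P_j]_{qp} \right) = \sum_j \frac{\partial F}{\partial P_j}\delta_{jk} = \frac{\partial F}{\partial P_k} \tag{15.28}$$

That is

$$[F,Q_k]_{qp} = -\frac{\partial F}{\partial P_k} \tag{15.29}$$

Similarly

$$[P_k,F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[P_k,Q_j]_{qp} + \frac{\partial F}{\partial P_j}[P_k,P_j]_{qp} \right) \tag{15.30}$$

leading to

$$[F,P_k]_{qp} = \frac{\partial F}{\partial Q_k} \tag{15.31}$$

Substituting equations (15.29) and (15.31) into equation (15.27) gives

$$[F,G]_{qp} = \sum_k \left( \frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k} \right) = [F,G]_{QP} \tag{15.32}$$

Thus the canonical variable subscripts $(q,p)$ and $(Q,P)$ can be ignored since the Poisson bracket is invariant to any canonical transformation of canonical variables. Conversely, if the Poisson bracket is independent of the transformation, then the transformation is canonical.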


15.1 Example: Check that a transformation is canonical 


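The independence of Poisson brackets to canonical transformations can be used to test if a transformation is canonical. Assume that the transformation equations between two sets of coordinates are given by

$$Q = \ln\left(1 + q^{1/2}\cos p\right) \qquad P = 2\left(1 + q^{1/2}\cos p\right) q^{1/2}\sin p$$

Evaluating the Poisson brackets gives $[Q,Q] = 0$, $[P,P] = 0$ while

$$[Q,P] = \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial Q}{\partial p}\frac{\partial P}{\partial q}$$

$$= \frac{\frac{1}{2}q^{-1/2}\cos p}{1 + q^{1/2}\cos p}\, 2\left[ -q\sin^2 p + \left(1 + q^{1/2}\cos p\right) q^{1/2}\cos p \right] + \frac{q^{1/2}\sin p}{1 + q^{1/2}\cos p}\, 2\left[ \tfrac{1}{2}\cos p \sin p + \left(1 + q^{1/2}\cos p\right)\tfrac{1}{2} q^{-1/2}\sin p \right] = 1$$

Therefore if $q,p$ are canonical with a Poisson bracket $[q,p] = 1$, then so are $Q,P$ since $[Q,P] = 1 = [q,p]$.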
Since it has been shown that this transformation is canonical, it is possible to go further and determine the function that generates this transformation. Solving the transformation equations for $q$ and $P$ gives

$$q = \left(e^{Q} - 1\right)^2 \sec^2 p \qquad P = 2e^{Q}\left(e^{Q} - 1\right)\tan p$$

Since the transformation is canonical, there exists a generating function $F_3(Q,p)$ such that

$$q = -\frac{\partial F_3}{\partial p} \qquad P = -\frac{\partial F_3}{\partial Q}$$

The transformation function $F_3(Q,p)$ can be obtained using

$$dF_3(Q,p) = \frac{\partial F_3}{\partial Q}dQ + \frac{\partial F_3}{\partial p}dp = -P\,dQ - q\,dp$$

$$= -d\left[\left(e^{Q} - 1\right)^2\right]\tan p - \left(e^{Q} - 1\right)^2 d\tan p = -d\left[\left(e^{Q} - 1\right)^2 \tan p\right]$$

This then gives that the required generating function is

$$F_3(Q,p) = -\left(e^{Q} - 1\right)^2 \tan p$$

This example illustrates how to determine a useful generating function and prove that the transformation is canonical.

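
The result of this example can be checked numerically. The sketch below is an illustration, not part of the original text. It evaluates $[Q,P]$ by central differences at a few arbitrarily chosen points with $q > 0$, and confirms that the generating function $F_3(Q,p) = -(e^Q - 1)^2\tan p$ reproduces $q = -\partial F_3/\partial p$ and $P = -\partial F_3/\partial Q$. The function names and sample points are assumptions.

```python
import math
from typing import Tuple


def Q_of(q: float, p: float) -> float:
    """New coordinate Q = ln(1 + sqrt(q) cos p)."""
    return math.log(1.0 + math.sqrt(q) * math.cos(p))


def P_of(q: float, p: float) -> float:
    """New momentum P = 2 (1 + sqrt(q) cos p) sqrt(q) sin p."""
    return 2.0 * (1.0 + math.sqrt(q) * math.cos(p)) * math.sqrt(q) * math.sin(p)


def bracket_QP(q: float, p: float, h: float = 1e-6) -> float:
    """Central-difference estimate of [Q, P] taken with respect to (q, p)."""
    dQ_dq = (Q_of(q + h, p) - Q_of(q - h, p)) / (2.0 * h)
    dQ_dp = (Q_of(q, p + h) - Q_of(q, p - h)) / (2.0 * h)
    dP_dq = (P_of(q + h, p) - P_of(q - h, p)) / (2.0 * h)
    dP_dp = (P_of(q, p + h) - P_of(q, p - h)) / (2.0 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq


def F3(Q: float, p: float) -> float:
    """Generating function F3(Q, p) = -(e^Q - 1)^2 tan p."""
    return -((math.exp(Q) - 1.0) ** 2) * math.tan(p)


def recover_q_and_P(Q: float, p: float, h: float = 1e-6) -> Tuple[float, float]:
    """Return (q, P) from q = -dF3/dp and P = -dF3/dQ."""
    q_rec = -(F3(Q, p + h) - F3(Q, p - h)) / (2.0 * h)
    P_rec = -(F3(Q + h, p) - F3(Q - h, p)) / (2.0 * h)
    return q_rec, P_rec


# Arbitrary sample points with q > 0 and 1 + sqrt(q) cos p > 0.
samples = [(0.5, 0.3), (1.7, 1.0), (2.0, -0.4)]
brackets = [bracket_QP(q, p) for q, p in samples]
```

Each entry of `brackets` should be close to 1, and `recover_q_and_P(Q_of(q, p), p)` should return values close to `(q, P_of(q, p))`.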
15.2.4 Correspondence of the commutator and the Poisson Bracket 


In classical mechanics there is a formal correspondence between the Poisson bracket and the commutator. This can be shown by deriving the Poisson bracket of four functions taken in two pairs. The derivation requires deriving the two possible Poisson brackets involving three functions.

$$[F_1F_2,G] = \sum_i \left[ \left( \frac{\partial F_1}{\partial q_i}F_2 + F_1\frac{\partial F_2}{\partial q_i} \right)\frac{\partial G}{\partial p_i} - \left( \frac{\partial F_1}{\partial p_i}F_2 + F_1\frac{\partial F_2}{\partial p_i} \right)\frac{\partial G}{\partial q_i} \right]$$
$$= [F_1,G]\,F_2 + F_1\,[F_2,G] \tag{15.33}$$
$$[F,G_1G_2] = [F,G_1]\,G_2 + G_1\,[F,G_2] \tag{15.34}$$

These two Poisson brackets for three functions can be used to derive the Poisson bracket of four functions, taken in pairs. This can be accomplished two ways using either equation 15.33 or 15.34.

$$[F_1F_2,G_1G_2] = [F_1,G_1G_2]\,F_2 + F_1\,[F_2,G_1G_2]$$
$$= \{[F_1,G_1]\,G_2 + G_1\,[F_1,G_2]\}\,F_2 + F_1\,\{[F_2,G_1]\,G_2 + G_1\,[F_2,G_2]\}$$
$$= [F_1,G_1]\,G_2F_2 + G_1\,[F_1,G_2]\,F_2 + F_1\,[F_2,G_1]\,G_2 + F_1G_1\,[F_2,G_2] \tag{15.35}$$

The alternative approach gives

$$[F_1F_2,G_1G_2] = [F_1F_2,G_1]\,G_2 + G_1\,[F_1F_2,G_2]$$
$$= [F_1,G_1]\,F_2G_2 + F_1\,[F_2,G_1]\,G_2 + G_1\,[F_1,G_2]\,F_2 + G_1F_1\,[F_2,G_2] \tag{15.36}$$

These two alternate derivations give different relations for the same Poisson bracket. Equating the alternative equations 15.35 and 15.36 gives that

$$[F_1,G_1]\,(F_2G_2 - G_2F_2) = (F_1G_1 - G_1F_1)\,[F_2,G_2]$$

This can be factored into separate relations, the left-hand side for body 1, and the right-hand side for body 2.

$$\frac{(F_1G_1 - G_1F_1)}{[F_1,G_1]} = \frac{(F_2G_2 - G_2F_2)}{[F_2,G_2]} = \lambda \tag{15.37}$$
Since the left-hand ratio holds for $F_1,G_1$ independent of $F_2,G_2$, and vice versa, then they must equal a constant $\lambda$ that does not depend on $F_1,G_1$, does not depend on $F_2,G_2$, and $\lambda$ must commute with $(F_1G_1 - G_1F_1)$. That is, $\lambda$ must be a constant number independent of these variables.

$$(F_1G_1 - G_1F_1) = \lambda\,[F_1,G_1] \equiv \lambda \sum_i \left( \frac{\partial F_1}{\partial q_i}\frac{\partial G_1}{\partial p_i} - \frac{\partial F_1}{\partial p_i}\frac{\partial G_1}{\partial q_i} \right) \tag{15.38}$$

Equation 15.38 is an especially important result which states that to within a multiplicative constant number $\lambda$, there is a one-to-one correspondence between the Poisson bracket and the commutator of two independent functions. An important implication is that if two functions, $F_i, G_k$, have a Poisson bracket that is zero, then the commutator of the two functions also must be zero, that is, $F_i$ and $G_k$ commute.

Consider the special case where the variables $F_i$ and $G_k$ correspond to the fundamental canonical variables, $(q_k, p_l)$. Then the commutators of the fundamental canonical variables are given by

$$q_kp_l - p_lq_k = \lambda\,[q_k,p_l] = \lambda\,\delta_{kl} \tag{15.39}$$
$$q_kq_l - q_lq_k = \lambda\,[q_k,q_l] = 0 \tag{15.40}$$
$$p_kp_l - p_lp_k = \lambda\,[p_k,p_l] = 0 \tag{15.41}$$

In 1925, Paul Dirac, a 23-year old graduate student at Cambridge, recognized that the formal correspondence between the Poisson bracket in classical mechanics, and the corresponding commutator, provides a logical and consistent way to bridge the chasm between the Hamiltonian formulation of classical mechanics, and quantum mechanics. He realized that making the assumption that the constant $\lambda = i\hbar$ leads to Heisenberg’s fundamental commutation relations in quantum mechanics, as is discussed in chapter 18.3.1. Assuming that $\lambda = i\hbar$ provides a logical and consistent way that builds quantization directly into classical mechanics, rather than using ad-hoc, case-dependent, hypotheses as was used by the older quantum theory of Bohr.


15.2.5 Observables in Hamiltonian mechanics 


Poisson brackets, and the corresponding commutation relations, are especially useful for elucidating which 
observables are constants of motion, and whether any two observables can be measured simultaneously and 
exactly. The properties of any observable are determined by the following two criteria. 


Time dependence: 


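The total time differential of a function $G(q_i,p_i,t)$ is defined by

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\dot{q}_i + \frac{\partial G}{\partial p_i}\dot{p}_i \right) \tag{15.42}$$

Hamilton’s canonical equations give that

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{15.43}$$
$$\dot{p}_i = -\frac{\partial H}{\partial q_i} \tag{15.44}$$

Substituting these in the above relation gives

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial G}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$

that is

$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G,H] \tag{15.45}$$

This important equation states that the total time derivative of any function $G(q,p,t)$ can be expressed in terms of the partial time derivative plus the Poisson bracket of $G(q,p,t)$ with the Hamiltonian.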
Any observable $G(p,q,t)$ will be a constant of motion if $\frac{dG}{dt} = 0$, and thus equation (15.45) gives

$$\frac{\partial G}{\partial t} + [G,H] = 0 \qquad \text{(if } G \text{ is a constant of motion)}$$

That is, it is a constant of motion when

$$\frac{\partial G}{\partial t} = [H,G] \tag{15.46}$$

Moreover, this can be extended further to the statement that if the constant of motion $G$ is not explicitly time dependent then

$$[G,H] = 0 \tag{15.47}$$

The Poisson bracket with the Hamiltonian is zero for a constant of motion $G$ that is not explicitly time dependent. Often it is more useful to turn this statement around with the statement that if $[G,H] = 0$, and $\frac{\partial G}{\partial t} = 0$, then $\frac{dG}{dt} = 0$, implying that $G$ is a constant of motion.
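
As a short worked illustration of equation (15.46), offered here as an assumed example rather than one taken from the text, consider a free particle with $H = p^2/2m$ and the explicitly time-dependent function $G = q - pt/m$, which is the initial position of the particle.

```latex
\frac{\partial G}{\partial t} = -\frac{p}{m}
\qquad
[G,H] = \frac{\partial G}{\partial q}\frac{\partial H}{\partial p}
      - \frac{\partial G}{\partial p}\frac{\partial H}{\partial q}
      = (1)\left(\frac{p}{m}\right) - \left(-\frac{t}{m}\right)(0) = \frac{p}{m}

\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G,H] = -\frac{p}{m} + \frac{p}{m} = 0
```

Here $G$ is a constant of motion even though $[G,H] \neq 0$, because $\frac{\partial G}{\partial t} = [H,G]$ as required by equation (15.46). The simpler condition (15.47) applies only when $G$ has no explicit time dependence.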


Independence 


Consider two observables $F(p,q,t)$ and $G(p,q,t)$. The independence of these two observables is determined by the Poisson bracket

$$[F,G] = -[G,F] \tag{15.48}$$

If this Poisson bracket is zero, that is, if the two observables $F(p,q,t)$ and $G(p,q,t)$ commute, then their values are independent and can be measured independently. However, if the Poisson bracket $[F,G] \neq 0$, that is $F(p,q,t)$ and $G(p,q,t)$ do not commute, then $F$ and $G$ are correlated since interchanging the order of the Poisson bracket changes the sign which implies that the measured value for $F$ depends on whether $G$ is simultaneously measured.

A useful property of Poisson brackets is that if $F$ and $G$ both are constants of motion, then the double Poisson bracket $[H,[F,G]] = 0$. This can be proved using Jacobi’s identity

$$[F,[G,H]] + [G,[H,F]] + [H,[F,G]] = 0 \tag{15.49}$$

If $[G,H] = 0$ and $[F,H] = 0$, then $[H,[F,G]] = 0$, that is, the Poisson bracket $[F,G]$ commutes with $H$. Note that if $F$ and $G$ do not depend explicitly on time, that is $\frac{\partial F}{\partial t} = \frac{\partial G}{\partial t} = 0$, then combining equations (15.45) and (15.49) leads to Poisson’s Theorem that relates the total time derivatives.

$$\frac{d}{dt}[F,G] = \left[\frac{dF}{dt},G\right] + \left[F,\frac{dG}{dt}\right] \tag{15.50}$$

This implies that if $F$ and $G$ are invariants, that is $\frac{dF}{dt} = \frac{dG}{dt} = 0$, then the Poisson bracket $[F,G]$ is an invariant if $F$ and $G$ are not explicitly time dependent.


15.2 Example: Angular momentum: 


Angular momentum, $\mathbf{L}$, provides an example of the use of Poisson brackets to elucidate which observables can be determined simultaneously. Consider that the Hamiltonian is time independent with a spherically symmetric potential $U(r)$. Then it is best to treat such a spherically symmetric potential using spherical coordinates since the potential is independent of both $\theta$ and $\phi$.

The Poisson brackets in classical mechanics can be used to tell us if two observables will commute. Since $U(r)$ is time independent, then the Hamiltonian in spherical coordinates is

$$H = T + U = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta} \right) + U(r)$$

Evaluating the Poisson bracket using the above Hamiltonian gives

$$[p_\phi,H] = 0$$

Since $p_\phi$ is not an explicit function of time, $\frac{\partial p_\phi}{\partial t} = 0$, then $\frac{dp_\phi}{dt} = 0$, that is, the angular momentum about the $z$ axis $L_z = p_\phi$ is a constant of motion.

The total angular momentum $L^2$ commutes with the Hamiltonian, that is

$$[L^2,H] = \left[ p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta},\, H \right] = 0$$

Since the total angular momentum $L^2 = p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ is not explicitly time dependent, then it also must be a constant of motion. Note that Noether’s theorem gives that both the angular momenta $L^2$ and $L_z$ are constants of motion. Also since the Poisson brackets are

$$[L_z,H] = 0$$
$$[L^2,H] = 0$$

then Jacobi’s identity, equation 15.17, can be used to imply that

$$[H,[L^2,L_z]] = 0$$

That is, the Poisson bracket $[L^2,L_z]$ is a constant of motion. Note that if $L^2$ and $L_z$ commute, that is, $[L^2,L_z] = 0$, then they can be measured simultaneously with unlimited accuracy, and this also satisfies that $[L^2,L_z]$ commutes with $H$.
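The $(x,y,z)$ components of the angular momentum $\mathbf{L}$ are given by

$$L_x = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_x = \sum_{i=1}^{n} \left( y_ip_{z,i} - z_ip_{y,i} \right)$$
$$L_y = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_y = \sum_{i=1}^{n} \left( z_ip_{x,i} - x_ip_{z,i} \right)$$
$$L_z = \sum_{i=1}^{n} (\mathbf{r}\times\mathbf{p})_z = \sum_{i=1}^{n} \left( x_ip_{y,i} - y_ip_{x,i} \right)$$

Evaluate the Poisson bracket

$$[L_x,L_y] = \sum_{i=1}^{n} \left[ \left( \frac{\partial L_x}{\partial x_i}\frac{\partial L_y}{\partial p_{x,i}} - \frac{\partial L_x}{\partial p_{x,i}}\frac{\partial L_y}{\partial x_i} \right) + \left( \frac{\partial L_x}{\partial y_i}\frac{\partial L_y}{\partial p_{y,i}} - \frac{\partial L_x}{\partial p_{y,i}}\frac{\partial L_y}{\partial y_i} \right) + \left( \frac{\partial L_x}{\partial z_i}\frac{\partial L_y}{\partial p_{z,i}} - \frac{\partial L_x}{\partial p_{z,i}}\frac{\partial L_y}{\partial z_i} \right) \right]$$
$$= \sum_{i=1}^{n} \left[ (0) + (0) + \left( x_ip_{y,i} - y_ip_{x,i} \right) \right] = L_z$$

Similarly, the Poisson brackets for $L_x, L_y, L_z$ are

$$[L_x,L_y] = L_z$$
$$[L_y,L_z] = L_x$$
$$[L_z,L_x] = L_y$$

where $x$, $y$, and $z$ are taken in a right-handed cyclic order. This usually is written in the form

$$[L_i,L_j] = \epsilon_{ijk}L_k$$

where the Levi-Civita density $\epsilon_{ijk}$ equals zero if two of the $ijk$ indices are identical, otherwise it is $+1$ for a cyclic permutation of $i,j,k$, and $-1$ for a non-cyclic permutation.

Note that since these Poisson brackets are nonzero, the components of the angular momentum $L_x, L_y, L_z$ do not commute and thus they cannot be measured precisely simultaneously. Thus we see that although $L^2$ and $L_i$ are simultaneous constants of motion, where the subscript $i$ can be either $x$, $y$, or $z$, only one component $L_i$ can be measured simultaneously with $L^2$. This behavior is exhibited by rigid-body rotation where the body precesses around one component of the total angular momentum, $L_z$, such that the total angular momentum, $L^2$, plus the component along one axis, $L_z$, are constants of motion. Then $L_x^2 + L_y^2 = L^2 - L_z^2$ is constant but not the individual $L_x$ or $L_y$.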
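
The angular momentum brackets can be verified numerically for a single particle. The following sketch is illustrative and not from the text. It uses central differences, so the results hold only to within the finite-difference error, and the phase-space point `r0`, `p0` is an arbitrary assumption.

```python
from typing import Callable, Sequence

Vec = Sequence[float]
PhaseFn = Callable[[Vec, Vec], float]


def bracket(F: PhaseFn, G: PhaseFn, r: Vec, p: Vec, h: float = 1e-5) -> float:
    """Central-difference Poisson bracket [F, G] at the point (r, p)."""

    def deriv(f: PhaseFn, wrt_position: bool, i: int) -> float:
        r_hi, r_lo, p_hi, p_lo = list(r), list(r), list(p), list(p)
        if wrt_position:
            r_hi[i] += h
            r_lo[i] -= h
        else:
            p_hi[i] += h
            p_lo[i] -= h
        return (f(r_hi, p_hi) - f(r_lo, p_lo)) / (2.0 * h)

    return sum(
        deriv(F, True, i) * deriv(G, False, i)
        - deriv(F, False, i) * deriv(G, True, i)
        for i in range(len(r))
    )


def Lx(r: Vec, p: Vec) -> float:
    return r[1] * p[2] - r[2] * p[1]


def Ly(r: Vec, p: Vec) -> float:
    return r[2] * p[0] - r[0] * p[2]


def Lz(r: Vec, p: Vec) -> float:
    return r[0] * p[1] - r[1] * p[0]


def L_squared(r: Vec, p: Vec) -> float:
    return Lx(r, p) ** 2 + Ly(r, p) ** 2 + Lz(r, p) ** 2


# Arbitrary phase-space point (x, y, z) and (p_x, p_y, p_z).
r0, p0 = [0.4, -1.1, 0.8], [1.3, 0.2, -0.5]

lx_ly = bracket(Lx, Ly, r0, p0)          # expected to match Lz(r0, p0)
l2_lz = bracket(L_squared, Lz, r0, p0)   # expected to vanish
```

In this sketch `lx_ly` agrees with `Lz(r0, p0)`, while `l2_lz` vanishes, consistent with $L^2$ and a single component being simultaneously measurable.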
15.2.6 Hamilton’s equations of motion


An especially important application of Poisson brackets is that Hamilton’s canonical equations of motion 
can be expressed directly in the Poisson bracket form. The Poisson bracket representation of Hamiltonian 
mechanics has important implications to quantum mechanics as will be described in chapter 18. 
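In equation (15.45) assume that $G$ is a fundamental coordinate, that is, $G = q_k$. Since $q_k$ is not explicitly time dependent, then

$$\frac{dq_k}{dt} = \frac{\partial q_k}{\partial t} + [q_k,H] \tag{15.51}$$
$$= 0 + \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$
$$= \sum_i \left( \delta_{ki}\frac{\partial H}{\partial p_i} - 0\cdot\frac{\partial H}{\partial q_i} \right)$$
$$= \frac{\partial H}{\partial p_k} \tag{15.52}$$

That is

$$\dot{q}_k = [q_k,H] = \frac{\partial H}{\partial p_k} \tag{15.53}$$

Similarly consider the fundamental canonical momentum $G = p_k$. Since it is not explicitly time dependent, then

$$\frac{dp_k}{dt} = \frac{\partial p_k}{\partial t} + [p_k,H] \tag{15.54}$$
$$= 0 + \sum_i \left( \frac{\partial p_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial p_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$
$$= \sum_i \left( 0\cdot\frac{\partial H}{\partial p_i} - \delta_{ki}\frac{\partial H}{\partial q_i} \right)$$
$$= -\frac{\partial H}{\partial q_k} \tag{15.55}$$

That is

$$\dot{p}_k = [p_k,H] = -\frac{\partial H}{\partial q_k} \tag{15.56}$$

Thus, it is seen that the Poisson bracket form of the equations of motion includes the Hamilton equations of motion. That is,

$$\dot{q}_k = [q_k,H] = \frac{\partial H}{\partial p_k} \tag{15.57}$$
$$\dot{p}_k = [p_k,H] = -\frac{\partial H}{\partial q_k} \tag{15.58}$$

The above shows that the full structure of Hamilton’s equations of motion can be expressed directly in terms of Poisson brackets.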

The elegant formulation of Poisson brackets has the same form in all canonical coordinates as the Hamil- 
tonian formulation. However, the normal Hamilton canonical equations in classical mechanics assume implic- 
itly that one can specify the exact position and momentum of a particle simultaneously at any point in time 
which is applicable only to classical mechanics variables that are continuous functions of the coordinates, 
and not to quantized systems. The important feature of the Poisson Bracket representation of Hamilton’s 
equations is that it generalizes Hamilton’s equations into a form (15.57, 15.58) where the Poisson bracket is 
equally consistent with both classical and quantum mechanics in that it allows for non-commuting canonical 
variables and Heisenberg’s Uncertainty Principle. Thus the generalization of Hamilton’s equations, via use 
of the Poisson brackets, provides one of the most powerful analytic tools applicable to both classical and 
quantal dynamics. It played a pivotal role in derivation of quantum theory as described in chapter 18. 
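
Equations (15.57) and (15.58) can be used directly as an integration recipe. The following is a minimal sketch under stated assumptions: a one-dimensional harmonic oscillator with illustrative values of the mass `M` and spring constant `K`, brackets estimated by central differences, and a simple midpoint (second-order Runge-Kutta) time step. None of these choices comes from the text.

```python
import math
from typing import Callable, Tuple

Observable = Callable[[float, float], float]

M, K = 1.0, 4.0  # illustrative mass and spring constant (assumed values)


def hamiltonian(q: float, p: float) -> float:
    return p * p / (2.0 * M) + 0.5 * K * q * q


def coordinate(q: float, p: float) -> float:
    return q


def momentum(q: float, p: float) -> float:
    return p


def bracket(F: Observable, G: Observable, q: float, p: float,
            h: float = 1e-6) -> float:
    """Central-difference Poisson bracket [F, G] for one degree of freedom."""
    dF_dq = (F(q + h, p) - F(q - h, p)) / (2.0 * h)
    dF_dp = (F(q, p + h) - F(q, p - h)) / (2.0 * h)
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2.0 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2.0 * h)
    return dF_dq * dG_dp - dF_dp * dG_dq


def rates(q: float, p: float) -> Tuple[float, float]:
    """Return (dq/dt, dp/dt) = ([q, H], [p, H]), eqs. (15.57) and (15.58)."""
    return (bracket(coordinate, hamiltonian, q, p),
            bracket(momentum, hamiltonian, q, p))


def evolve(q: float, p: float, t_end: float, steps: int) -> Tuple[float, float]:
    """Integrate the bracket equations with the midpoint method."""
    dt = t_end / steps
    for _ in range(steps):
        dq, dp = rates(q, p)
        dq_mid, dp_mid = rates(q + 0.5 * dt * dq, p + 0.5 * dt * dp)
        q, p = q + dt * dq_mid, p + dt * dp_mid
    return q, p


period = 2.0 * math.pi * math.sqrt(M / K)
q_end, p_end = evolve(1.0, 0.0, period, 2000)
```

After one period the state returns close to its starting point $(q,p) = (1,0)$ and the energy is nearly unchanged, up to the discretization error of the midpoint step.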
15.3 Example: Lorentz force in electromagnetism


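Consider a charge $q$, and mass $m$, in electromagnetic fields with scalar potential $\Phi$ and vector potential $\mathbf{A}$. Chapter 6.10 showed that the Lagrangian for electromagnetism can be written as

$$L = \frac{1}{2}m\dot{\mathbf{x}}\cdot\dot{\mathbf{x}} - q\left(\Phi - \mathbf{A}\cdot\dot{\mathbf{x}}\right)$$

The generalized momentum then is given by

$$\mathbf{p} = \frac{\partial L}{\partial \dot{\mathbf{x}}} = m\dot{\mathbf{x}} + q\mathbf{A}$$

Thus the Hamiltonian can be written as

$$H = (\mathbf{p}\cdot\dot{\mathbf{x}}) - L = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$$

The Hamilton equations of motion give

$$\dot{\mathbf{x}} = [\mathbf{x},H] = \frac{\mathbf{p} - q\mathbf{A}}{m}$$

and

$$\dot{\mathbf{p}} = [\mathbf{p},H] = -q\nabla\Phi + \frac{q}{m}\left\{ (\mathbf{p} - q\mathbf{A})\times(\nabla\times\mathbf{A}) + \left[(\mathbf{p} - q\mathbf{A})\cdot\nabla\right]\mathbf{A} \right\}$$

Define the magnetic field to be

$$\mathbf{B} \equiv \nabla\times\mathbf{A}$$

and the electric field to be

$$\mathbf{E} \equiv -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}$$

then, using $m\ddot{\mathbf{x}} = \dot{\mathbf{p}} - q\frac{d\mathbf{A}}{dt}$ with $\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A}$, the Lorentz force can be written as

$$\mathbf{F} = m\ddot{\mathbf{x}} = q\left(\mathbf{E} + \dot{\mathbf{x}}\times\mathbf{B}\right)$$
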

15.4 Example: Wavemotion: 

Assume that one is dealing with traveling waves of the form Y = Act aeP=—wt) for a one-dimensional 
conservative system of many identical coupled linear oscillators. Then evaluating the following Poisson 
brackets gives 


ae) 


Thus $p_x$, $x$, $\omega$, and $t$ are constants of motion. However,

$$[p_x,x] \neq 0$$
$$[\omega,t] \neq 0$$

Thus one cannot simultaneously measure the conjugate variables $(p_x,x)$ or $(\omega,t)$. This is the Uncertainty Principle that is manifest by all forms of wave motion in classical and quantal mechanics as discussed in chapter 3.11.3.
15.5 Example: Two-dimensional, anisotropic, linear oscillator


Consider a mass $m$ bound by an anisotropic, two-dimensional, linear oscillator potential. As discussed in chapter 11, the motion can be described as lying entirely in the $x-y$ plane that is perpendicular to the angular momentum $\mathbf{J}$. It is interesting to derive the equations of motion for this system using the Poisson bracket representation of Hamiltonian mechanics.

The kinetic energy is given by

$$T(\dot{x},\dot{y}) = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right)$$

The linear binding is reproduced assuming a quadratic scalar potential energy of the form

$$U(x,y) = \frac{1}{2}k\left(x^2 + y^2\right) + \eta xy$$

where $\eta$ is the anharmonic strength that couples the modes of the isotropic linear oscillator.

a) NORMAL MODES: As discussed in chapter 14, a transformation to the normal modes of the system is given by using variables $(\alpha,\beta)$ where $\alpha = \frac{1}{\sqrt{2}}(x+y)$ and $\beta = \frac{1}{\sqrt{2}}(x-y)$, that is

$$x = \frac{1}{\sqrt{2}}(\alpha+\beta) \qquad y = \frac{1}{\sqrt{2}}(\alpha-\beta)$$

Expressing the kinetic and potential energies in terms of the new coordinates gives

$$T(\dot{\alpha},\dot{\beta}) = \frac{1}{4}m\left[ (\dot{\alpha}+\dot{\beta})^2 + (\dot{\alpha}-\dot{\beta})^2 \right] = \frac{1}{2}m\left(\dot{\alpha}^2 + \dot{\beta}^2\right)$$

$$U = \frac{1}{4}k\left[ (\alpha+\beta)^2 + (\alpha-\beta)^2 \right] + \frac{1}{2}\eta\left(\alpha^2 - \beta^2\right) = \frac{1}{2}(k+\eta)\alpha^2 + \frac{1}{2}(k-\eta)\beta^2$$

Note that the coordinate transformation makes the Lagrangian separable, that is

$$L = \frac{1}{2}m\left(\dot{\alpha}^2 + \dot{\beta}^2\right) - \frac{1}{2}(k+\eta)\alpha^2 - \frac{1}{2}(k-\eta)\beta^2 = L_\alpha + L_\beta$$

where

$$L_\alpha = \frac{1}{2}m\dot{\alpha}^2 - \frac{1}{2}(k+\eta)\alpha^2 \qquad L_\beta = \frac{1}{2}m\dot{\beta}^2 - \frac{1}{2}(k-\eta)\beta^2$$

This shows that the transformation has separated the system into two normal modes that are harmonic oscillators with angular frequencies

$$\omega_\alpha = \sqrt{\frac{k+\eta}{m}} \qquad \omega_\beta = \sqrt{\frac{k-\eta}{m}}$$

Note that the non-isotropic harmonic oscillator reduces to the isotropic linear oscillator when $\eta = 0$.


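b) HAMILTONIAN: The canonical momenta are given by

$$p_\alpha = \frac{\partial L}{\partial\dot{\alpha}} = m\dot{\alpha}$$
$$p_\beta = \frac{\partial L}{\partial\dot{\beta}} = m\dot{\beta}$$

The definition of the Hamiltonian gives

$$H = p_\alpha\dot{\alpha} + p_\beta\dot{\beta} - L = \frac{1}{2m}\left(p_\alpha^2 + p_\beta^2\right) + \frac{1}{2}(k+\eta)\alpha^2 + \frac{1}{2}(k-\eta)\beta^2$$

Note that this can be factored as

$$H = H_\alpha + H_\beta$$

where

$$H_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}(k+\eta)\alpha^2 \qquad H_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}(k-\eta)\beta^2$$

Using the Poisson bracket expression for the time dependence, equation 15.45, and using the fact that the Hamiltonian is not explicitly time dependent, that is, $\frac{\partial H_\alpha}{\partial t} = 0$, gives

$$\frac{dH_\alpha}{dt} = \frac{\partial H_\alpha}{\partial t} + [H_\alpha,H] = 0 + [H_\alpha,H_\alpha + H_\beta] = [H_\alpha,H_\beta]$$

$$= \frac{\partial H_\alpha}{\partial\alpha}\frac{\partial H_\beta}{\partial p_\alpha} + \frac{\partial H_\alpha}{\partial\beta}\frac{\partial H_\beta}{\partial p_\beta} - \frac{\partial H_\alpha}{\partial p_\alpha}\frac{\partial H_\beta}{\partial\alpha} - \frac{\partial H_\alpha}{\partial p_\beta}\frac{\partial H_\beta}{\partial\beta} = 0$$

Similarly $\frac{dH_\beta}{dt} = 0$. This implies that the Hamiltonians for both normal modes, $H_\alpha$ and $H_\beta$, are time-independent constants of motion which are equal to the total energy for each mode.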
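c) ANGULAR MOMENTUM: The angular momentum for motion in the $\alpha\beta$ plane is perpendicular to the $\alpha\beta$ plane with a magnitude of

$$J = m\left(\alpha p_\beta - \beta p_\alpha\right)$$

The time dependence of the angular momentum is given by

$$\frac{dJ}{dt} = \frac{\partial J}{\partial t} + [J,H] = 0 + \frac{\partial J}{\partial\alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial J}{\partial p_\alpha}\frac{\partial H}{\partial\alpha} + \frac{\partial J}{\partial\beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial J}{\partial p_\beta}\frac{\partial H}{\partial\beta}$$
$$= p_\beta p_\alpha + mk\beta\alpha + m\eta\beta\alpha - p_\alpha p_\beta - mk\alpha\beta + m\eta\alpha\beta = 2m\eta\alpha\beta$$

Note that if $\eta = 0$, then the two eigenfrequencies are degenerate, $\omega_\alpha = \omega_\beta$, that is, the system reduces to the isotropic harmonic oscillator in the $\alpha\beta$ plane that was discussed in chapter 11.9. In addition, $\frac{dJ}{dt} = 0$ for $\eta = 0$, that is, the angular momentum $J$ in the $\alpha\beta$ plane is a constant of motion when $\eta = 0$.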
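
Parts b) and c) can be checked numerically. The sketch below is an illustration rather than part of the text. It evaluates $[H_\alpha,H]$ and $[J,H]$ by central differences for assumed values of $m$, $k$, and $\eta$ at an arbitrary phase-space point, using $J = m(\alpha p_\beta - \beta p_\alpha)$ as written above.

```python
from typing import Callable, Sequence

Vec = Sequence[float]
PhaseFn = Callable[[Vec, Vec], float]

M, K, ETA = 1.5, 2.0, 0.3  # illustrative parameter values (assumptions)


def bracket(F: PhaseFn, G: PhaseFn, x: Vec, p: Vec, h: float = 1e-5) -> float:
    """Central-difference Poisson bracket over the pairs (alpha, p_alpha), (beta, p_beta)."""

    def deriv(f: PhaseFn, wrt_coordinate: bool, i: int) -> float:
        x_hi, x_lo, p_hi, p_lo = list(x), list(x), list(p), list(p)
        if wrt_coordinate:
            x_hi[i] += h
            x_lo[i] -= h
        else:
            p_hi[i] += h
            p_lo[i] -= h
        return (f(x_hi, p_hi) - f(x_lo, p_lo)) / (2.0 * h)

    return sum(
        deriv(F, True, i) * deriv(G, False, i)
        - deriv(F, False, i) * deriv(G, True, i)
        for i in range(len(x))
    )


def H_alpha(x: Vec, p: Vec) -> float:
    return p[0] ** 2 / (2.0 * M) + 0.5 * (K + ETA) * x[0] ** 2


def H_beta(x: Vec, p: Vec) -> float:
    return p[1] ** 2 / (2.0 * M) + 0.5 * (K - ETA) * x[1] ** 2


def H_total(x: Vec, p: Vec) -> float:
    return H_alpha(x, p) + H_beta(x, p)


def J(x: Vec, p: Vec) -> float:
    return M * (x[0] * p[1] - x[1] * p[0])


x0, p0 = [0.7, -0.4], [0.2, 0.9]  # arbitrary (alpha, beta) and (p_alpha, p_beta)

mode_energy_rate = bracket(H_alpha, H_total, x0, p0)  # expected to vanish
dJ_dt = bracket(J, H_total, x0, p0)                   # expected 2 m eta alpha beta
expected_dJ_dt = 2.0 * M * ETA * x0[0] * x0[1]
```

Setting `ETA = 0.0` in this sketch makes `dJ_dt` vanish, which corresponds to the isotropic case discussed above.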


d) SYMMETRY TENSOR: The symmetry tensor was defined in chapter 11.9.3 to be

$$A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$$

where $i$ and $j$ can correspond to either $\alpha$ or $\beta$. The symmetry tensor defines the orientation of the major axis of the elliptical orbit for the two-dimensional, isotropic, linear oscillator as described in chapter 11.9.3.

The isotropic oscillator has been shown to have two normal modes that are degenerate, therefore $\alpha$ and $\beta$ are equally good normal modes. For $\eta = 0$, the Hamiltonian shows that the total energy is conserved, as are the energies of each of the two normal modes, which are

$$E_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}k\alpha^2 \qquad E_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}k\beta^2$$

Consider the matrix element

$$A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$$

where $i,j$ each can represent $\alpha$ or $\beta$. Then for each matrix element

$$\frac{dA'_{ij}}{dt} = \frac{\partial A'_{ij}}{\partial t} + \frac{\partial A'_{ij}}{\partial\alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial A'_{ij}}{\partial p_\alpha}\frac{\partial H}{\partial\alpha} + \frac{\partial A'_{ij}}{\partial\beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial A'_{ij}}{\partial p_\beta}\frac{\partial H}{\partial\beta} = 0$$

That is, each matrix element $A'_{ij}$ commutes with the Hamiltonian

$$[A'_{ij},H] = 0$$

Thus the Poisson bracket representation of Hamiltonian mechanics has been used to prove that the symmetry tensor $A'_{ij} = \frac{p_ip_j}{2m} + \frac{1}{2}kx_ix_j$ is a constant of motion for the isotropic harmonic oscillator. That is, all the elements $A'_{\alpha\alpha}$, $A'_{\beta\beta}$, and $A'_{\alpha\beta}$ of the symmetric tensor $\mathbf{A}'$ commute with the Hamiltonian.

Note that the three constants of motion, $L$, $\mathbf{A}'$ and $H$ for the isotropic, two-dimensional, linear oscillator form a closed algebra under the Poisson bracket formalism.


15.6 Example: The eccentricity vector 


Chapter 11.8.4 showed that Hamilton’s eccentricity vector for the inverse square-law attractive force,

$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})$$

is a constant of motion that specifies the major axis of the elliptical orbit. The eccentricity vector for the inverse-square-law force can be investigated using Poisson brackets as was done for the symmetry tensor above. It can be shown that

$$[L_i,A_j] = \epsilon_{ijk}A_k$$
$$[A_i,A_j] = -2\mu\left( \frac{p^2}{2\mu} + \frac{k}{r} \right)\epsilon_{ijk}L_k \tag{a}$$

Note that the bracket on the right-hand side of equation (a) equals the Hamiltonian $H$ for the inverse square-law attractive force, and thus the Poisson bracket equals

$$[A_i,A_j] = -2\mu\left( \frac{p^2}{2\mu} + \frac{k}{r} \right)\epsilon_{ijk}L_k = -2\mu H\epsilon_{ijk}L_k$$

For the Hamiltonian $H$ it can be shown that the Poisson bracket

$$[H,\mathbf{A}] = 0$$

That is, the eccentricity vector commutes with the Hamiltonian and thus it is a constant of motion. Previously this result was obtained directly using the equations of motion as given in equation 11.87. Note that the three constants of motion, $\mathbf{L}$, $\mathbf{A}$ and $H$ form a closed algebra under the Poisson bracket formalism similar to the triad of constants of motion, $L$, $\mathbf{A}'$ and $H$ that occur for the two-dimensional, isotropic linear oscillator described above. Examples 15.5 and 15.6 illustrate that the Poisson bracket representation of Hamiltonian mechanics is a powerful probe of the underlying physics, as well as confirming the results obtained directly from the equations of motion as described in chapter 11.8.4 and 11.9.8.


15.2.7 Liouville’s Theorem 


Liouville's Theorem illustrates an application of Poisson brackets to Hamiltonian phase space that has important implications for statistical physics. The trajectory of a single particle in phase space is completely determined by the equations of motion if the initial conditions are known. However, many-body systems have so many degrees of freedom that it becomes impractical to solve all the equations of motion of the many bodies. An example is a statistical ensemble in a gas, a plasma, or a beam of particles. Usually it is not possible to specify the exact point in phase space for such complicated systems. However, it is possible to define an ensemble of points in phase space that encompasses all possible trajectories for the complicated system. That is, the statistical distribution of particles in phase space can be specified.

Figure 15.1: Infinitesimal element of area in phase space

Consider a density $\rho$ of representative points in $(q,p)$ phase space. The number $N$ of systems in the volume element $dv$ is

$$N = \rho\,dv \tag{15.59}$$

where it is assumed that the infinitesimal volume element $dv = dq_1dq_2\ldots dq_s\,dp_1dp_2\ldots dp_s$ contains many possible systems so that $\rho$ can be considered a continuous distribution. For the conjugate variables $(q_i,p_i)$ shown in figure 15.1, the number of representative points moving across the left-hand edge into the area per unit time is

$$\rho\dot{q}_i\,dp_i \tag{15.60}$$

The number of representative points flowing out of the area along the right-hand edge is

$$\left[ \rho\dot{q}_i + \frac{\partial}{\partial q_i}\left(\rho\dot{q}_i\right)dq_i \right]dp_i \tag{15.61}$$
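Hence the net increase in $\rho$ in the infinitesimal rectangular element $dq_idp_i$ due to flow in the horizontal direction is

$$-\frac{\partial}{\partial q_i}\left(\rho\dot{q}_i\right)dq_idp_i \tag{15.62}$$

Similarly, the net gain due to flow in the vertical direction is

$$-\frac{\partial}{\partial p_i}\left(\rho\dot{p}_i\right)dp_idq_i \tag{15.63}$$

Thus the total increase in the element $dq_idp_i$ per unit time is

$$-\left[ \frac{\partial}{\partial q_i}\left(\rho\dot{q}_i\right) + \frac{\partial}{\partial p_i}\left(\rho\dot{p}_i\right) \right]dp_idq_i \tag{15.64}$$

Assume that the total number of points must be conserved, then the total increase in the number of points inside the element $dq_idp_i$ must equal the net changes in $\rho$ on the infinitesimal surface element per unit time. That is

$$\left(\frac{\partial\rho}{\partial t}\right)dq_idp_i \tag{15.65}$$

Thus summing over all possible values of $i$ gives

$$\frac{\partial\rho}{\partial t} + \sum_i \left[ \frac{\partial}{\partial q_i}\left(\rho\dot{q}_i\right) + \frac{\partial}{\partial p_i}\left(\rho\dot{p}_i\right) \right] = 0 \tag{15.66}$$

or

$$\frac{\partial\rho}{\partial t} + \sum_i \left[ \dot{q}_i\frac{\partial\rho}{\partial q_i} + \dot{p}_i\frac{\partial\rho}{\partial p_i} \right] + \rho\sum_i \left[ \frac{\partial\dot{q}_i}{\partial q_i} + \frac{\partial\dot{p}_i}{\partial p_i} \right] = 0 \tag{15.67}$$

Inserting Hamilton’s canonical equations into both brackets and differentiating the last bracket results in

$$\frac{\partial\rho}{\partial t} + \sum_i \left[ \frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i} \right] + \rho\sum_i \left[ \frac{\partial^2 H}{\partial q_i\partial p_i} - \frac{\partial^2 H}{\partial p_i\partial q_i} \right] = 0 \tag{15.68}$$

The two terms in the last bracket cancel and thus

$$\frac{\partial\rho}{\partial t} + \sum_i \left[ \frac{\partial\rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial\rho}{\partial p_i}\frac{\partial H}{\partial q_i} \right] = \frac{\partial\rho}{\partial t} + [\rho,H] = 0 \tag{15.69}$$

However, this just equals $\frac{d\rho}{dt}$, therefore

$$\frac{d\rho}{dt} = \frac{\partial\rho}{\partial t} + [\rho,H] = 0 \tag{15.70}$$

This is called Liouville’s theorem which states that the rate of change of density of representative points vanishes, that is, the density of points is a constant in the Hamiltonian phase space along a specific trajectory. Liouville's theorem means that the system acts like an incompressible fluid that moves such as to occupy an equal volume in phase space at every instant, even though the shape of the phase-space volume may change, that is, the phase-space density of the fluid remains constant. Equation (15.70) is another illustration of the basic Poisson bracket relation (15.45) and the usefulness of Poisson brackets in physics.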

Liouville’s theorem is crucially important to statistical mechanics of ensembles where exact knowledge of the system is unavailable and only statistical averages are known. An example is in focussing of beams of charged particles by beam handling systems. At a focus of the beam, the transverse width in $x$ is minimized, while the width in $p_x$ is largest since the beam is converging to the focus, whereas a parallel beam has maximum width $x$ and minimum spreading width $p_x$. However, the product $xp_x$ remains constant throughout the focussing system. For a two dimensional beam, this applies equally for the $y$ and $p_y$ coordinates, etc. It is obvious that the final beam quality for any beam transport system is ultimately limited by the emittance of the source of the beam, that is, the initial area of the phase space distribution. Note that Liouville’s theorem only applies to Hamiltonian $q_i - p_i$ phase space, not to $q_i - \dot{q}_i$ Lagrangian state space. As a consequence, Hamiltonian dynamics, rather than Lagrange dynamics, is used to discuss ensembles in statistical physics.

Note that Liouville’s theorem is applicable only for conservative systems, that is, where Hamilton’s equations of motion apply. For dissipative systems the phase space volume shrinks with time rather than being a constant of the motion.
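
The contrast between conservative and dissipative flow can be illustrated with a small numerical sketch, which is not taken from the text. It follows the three corners of a triangle in $(q,p)$ space under the exact flow of a harmonic oscillator, with an optional linear damping term $\ddot{q} + \gamma\dot{q} + \omega^2 q = 0$ added as an assumed model of dissipation, and it computes the enclosed area with the shoelace formula. Because the flow is linear, the triangle maps to a triangle and the area comparison is exact up to rounding. The parameter values are arbitrary.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def shoelace_area(points: List[Point]) -> float:
    """Area of the polygon with vertices (q, p) listed in order."""
    total = 0.0
    for i, (q1, p1) in enumerate(points):
        q2, p2 = points[(i + 1) % len(points)]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0


def flow(q0: float, p0: float, t: float, m: float = 1.0,
         omega: float = 2.0, gamma: float = 0.0) -> Point:
    """Exact solution of q'' + gamma q' + omega^2 q = 0 with p = m q'.

    gamma = 0 is the conservative harmonic oscillator; gamma > 0 is an
    underdamped oscillator (requires gamma < 2 omega).
    """
    a = 0.5 * gamma
    big_omega = math.sqrt(omega ** 2 - a ** 2)
    v0 = p0 / m
    decay = math.exp(-a * t)
    c, s = math.cos(big_omega * t), math.sin(big_omega * t)
    q = decay * (q0 * c + (v0 + a * q0) / big_omega * s)
    v = decay * (v0 * c - (omega ** 2 * q0 + a * v0) / big_omega * s)
    return q, m * v


triangle = [(1.0, 0.0), (1.5, 0.0), (1.0, 0.5)]  # arbitrary initial ensemble boundary
t_final, gamma_damped = 0.73, 0.4                 # arbitrary illustrative values

area_initial = shoelace_area(triangle)
area_conservative = shoelace_area([flow(q, p, t_final) for q, p in triangle])
area_damped = shoelace_area(
    [flow(q, p, t_final, gamma=gamma_damped) for q, p in triangle]
)
```

In this sketch `area_conservative` equals `area_initial`, while `area_damped` is smaller by the factor $e^{-\gamma t}$, consistent with the statement that the phase-space volume shrinks for dissipative systems.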
15.3 Canonical transformations in Hamiltonian mechanics


Hamiltonian mechanics is an especially elegant and powerful way to derive the equations of motion for com- 
plicated systems. Unfortunately, integrating the equations of motion to derive a solution can be a challenge. 
Hamilton recognized this difficulty, so he proposed using generating functions to make canonical transfor- 
mations which transform the equations into a known soluble form. Jacobi, a contemporary mathematician, 
recognized the importance of Hamilton’s pioneering developments in Hamiltonian mechanics, and therefore 
he developed a sophisticated mathematical framework for exploiting the generating function formalism in 
order to make the canonical transformations required to solve Hamilton’s equations of motion. 

In the Lagrange formulation, transforming coordinates $(q_i,\dot{q}_i)$ to cyclic generalized coordinates $(Q_i,\dot{Q}_i)$ simplifies finding the Euler-Lagrange equations of motion. For the Hamiltonian formulation, the concept of coordinate transformations is extended to include simultaneous canonical transformation of both the spatial coordinates $q_i$ and the conjugate momenta $p_i$ from $(q_i,p_i)$ to $(Q_i,P_i)$, where both of the canonical variables are treated equally in the transformation. Compared to Lagrangian mechanics, Hamiltonian mechanics has twice as many variables which is an asset, rather than a liability, since it widens the realm of possible canonical transformations.

Hamiltonian mechanics has the advantage that generating functions can be exploited to make canonical 
transformations to find solutions, which avoids having to use direct integration. Canonical transformations 
are the foundation of Hamiltonian mechanics; they underlie Hamilton-Jacobi theory and action-angle variable 
theory, both of which are powerful means for exploiting Hamiltonian mechanics to solve problems in physics 
and engineering. The concept underlying canonical transformations is that, if the equations of motion are 
simplified by using a new set of generalized variables (Q, P), compared to using the original set of variables 
(q,p), then an advantage has been gained. The solution, expressed in terms of the generalized variables 
(Q, P), can be transformed back to express the solution in terms of the original coordinates, (q, p). 

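Only a specialized subset of transformations will be considered, namely canonical transformations that preserve the canonical form of Hamilton’s equations of motion. That is, given that the original set of variables $(q_i,p_i)$ satisfy Hamilton’s equations

$$\dot{\mathbf{q}} = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial\mathbf{p}} \qquad \dot{\mathbf{p}} = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial\mathbf{q}} \tag{15.71}$$

for some Hamiltonian $H(\mathbf{q},\mathbf{p},t)$, then the transformation to coordinates $Q_i(q_k,p_k,t)$, $P_i(q_k,p_k,t)$ is canonical if, and only if, there exists a function $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ such that the $\mathbf{P}$ and $\mathbf{Q}$ are still governed by Hamilton’s equations. That is,

$$\dot{\mathbf{Q}} = \frac{\partial\mathcal{H}(\mathbf{Q},\mathbf{P},t)}{\partial\mathbf{P}} \qquad \dot{\mathbf{P}} = -\frac{\partial\mathcal{H}(\mathbf{Q},\mathbf{P},t)}{\partial\mathbf{Q}} \tag{15.72}$$

where $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ plays the role of the Hamiltonian for the new variables. Note that $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ may be very different from the old Hamiltonian $H(\mathbf{q},\mathbf{p},t)$. The invariance of the Poisson bracket to canonical transformations, chapter 15.2.3, provides a powerful test that the transformation is canonical.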

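Hamilton’s Principle of least action, discussed in chapter 9, states that

$$\delta S = \delta\int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt = \delta\int_{t_1}^{t_2} \left[ \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) \right]dt = 0 \tag{15.73}$$

Similarly, applying Hamilton's Principle of least action to the new Lagrangian $\mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)$ gives

$$\delta S = \delta\int_{t_1}^{t_2} \mathcal{L}(\mathbf{Q},\dot{\mathbf{Q}},t)\,dt = \delta\int_{t_1}^{t_2} \left[ \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) \right]dt = 0 \tag{15.74}$$

The discussion of gauge-invariant Lagrangians, chapter 9.3, showed that $L$ and $\mathcal{L}$ can be related by the total time derivative of a generating function $F$ where

$$\frac{dF}{dt} = L - \mathcal{L} \tag{15.75}$$

The generating function $F$ can be any well-behaved function with continuous second derivatives of both the old and new canonical variables $\mathbf{p}, \mathbf{q}, \mathbf{P}, \mathbf{Q}$ and $t$. Thus the integrands of (15.73) and (15.74) are related by

$$\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \lambda\left[ \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) \right] + \frac{dF}{dt} \tag{15.76}$$

where $\lambda$ is a possible scale transformation. A scale transformation, such as changing units, is trivial, and will be assumed to be absorbed into the coordinates, making $\lambda = 1$. Assuming that $\lambda \neq 1$ is called an extended canonical transformation.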


15.3.1 Generating functions 


The generating function $F$ has to be chosen such that the transformation from the initial variables $(\mathbf{q},\mathbf{p})$ to the final variables $(\mathbf{Q},\mathbf{P})$ is a canonical transformation. The chosen generating function contributes to (15.76) only if it is a function of the old plus new variables. The four possible types of generating functions of the first kind are $F_1(\mathbf{q},\mathbf{Q},t)$, $F_2(\mathbf{q},\mathbf{P},t)$, $F_3(\mathbf{p},\mathbf{Q},t)$, and $F_4(\mathbf{p},\mathbf{P},t)$. These four generating functions lead to relatively simple canonical transformations, as shown below.


Type 1: $F = F_1(q,Q,t)$:


The total time derivative of the generating function $F = F_1(q,Q,t)$ is given by

$$\frac{dF_1(q,Q,t)}{dt} = \frac{\partial F_1(q,Q,t)}{\partial q} \cdot \dot{q} + \frac{\partial F_1(q,Q,t)}{\partial Q} \cdot \dot{Q} + \frac{\partial F_1(q,Q,t)}{\partial t} \tag{15.77}$$

Insert equation (15.77) into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then

$$\left[ p - \frac{\partial F_1(q,Q,t)}{\partial q} \right] \cdot \dot{q} - H(q,p,t) = \left[ P + \frac{\partial F_1(q,Q,t)}{\partial Q} \right] \cdot \dot{Q} - \mathcal{H}(Q,P,t) + \frac{\partial F_1(q,Q,t)}{\partial t}$$

Assume that the generating function $F_1$ determines the canonical variables p and P to be

$$p = \frac{\partial F_1(q,Q,t)}{\partial q} \qquad P = -\frac{\partial F_1(q,Q,t)}{\partial Q} \tag{15.78}$$

then the terms in each square bracket cancel, leading to the required canonical transformation

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F_1(q,Q,t)}{\partial t} \tag{15.79}$$

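Type 2: $F = F_2(q,P,t) - Q \cdot P$:


The total time derivative of the generating function $F = F_2(q,P,t) - Q \cdot P$ is given by

$$\frac{dF}{dt} = \left[ \frac{\partial F_2(q,P,t)}{\partial q} \cdot \dot{q} + \frac{\partial F_2(q,P,t)}{\partial P} \cdot \dot{P} - \dot{Q} \cdot P - Q \cdot \dot{P} \right] + \frac{\partial F_2(q,P,t)}{\partial t} \tag{15.80}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then

$$\left( p - \frac{\partial F_2(q,P,t)}{\partial q} \right) \cdot \dot{q} - H(q,p,t) = P \cdot \dot{Q} - \dot{Q} \cdot P + \left( \frac{\partial F_2(q,P,t)}{\partial P} - Q \right) \cdot \dot{P} - \mathcal{H}(Q,P,t) + \frac{\partial F_2(q,P,t)}{\partial t}$$

Assume that the generating function $F_2$ determines the canonical variables p and Q to be

$$p = \frac{\partial F_2(q,P,t)}{\partial q} \qquad Q = \frac{\partial F_2(q,P,t)}{\partial P} \tag{15.81}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F_2(q,P,t)}{\partial t} \tag{15.82}$$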

Type 3: $F = F_3(p,Q,t) + q \cdot p$:


The total time derivative of the generating function $F = F_3(p,Q,t) + q \cdot p$ is given by

$$\frac{dF}{dt} = \left[ \frac{\partial F_3(p,Q,t)}{\partial p} \cdot \dot{p} + \frac{\partial F_3(p,Q,t)}{\partial Q} \cdot \dot{Q} + \dot{q} \cdot p + q \cdot \dot{p} \right] + \frac{\partial F_3(p,Q,t)}{\partial t} \tag{15.83}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then

$$-\left[ q + \frac{\partial F_3(p,Q,t)}{\partial p} \right] \cdot \dot{p} - H(q,p,t) = \left[ P + \frac{\partial F_3(p,Q,t)}{\partial Q} \right] \cdot \dot{Q} - \mathcal{H}(Q,P,t) + \frac{\partial F_3(p,Q,t)}{\partial t}$$

Assume that the generating function $F_3$ determines the canonical variables q and P to be

$$q = -\frac{\partial F_3(p,Q,t)}{\partial p} \qquad P = -\frac{\partial F_3(p,Q,t)}{\partial Q} \tag{15.84}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F_3(p,Q,t)}{\partial t} \tag{15.85}$$

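Type 4: $F = F_4(p,P,t) + q \cdot p - Q \cdot P$:


The total time derivative of the generating function $F = F_4(p,P,t) + q \cdot p - Q \cdot P$ is given by

$$\frac{dF}{dt} = \left[ \frac{\partial F_4(p,P,t)}{\partial p} \cdot \dot{p} + \frac{\partial F_4(p,P,t)}{\partial P} \cdot \dot{P} + \dot{q} \cdot p + q \cdot \dot{p} - \dot{Q} \cdot P - Q \cdot \dot{P} \right] + \frac{\partial F_4(p,P,t)}{\partial t} \tag{15.86}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then

$$-\left[ q + \frac{\partial F_4(p,P,t)}{\partial p} \right] \cdot \dot{p} - H(q,p,t) = \left[ \frac{\partial F_4(p,P,t)}{\partial P} - Q \right] \cdot \dot{P} - \mathcal{H}(Q,P,t) + \frac{\partial F_4(p,P,t)}{\partial t}$$

Assume that the generating function $F_4$ determines the canonical variables q and Q to be

$$q = -\frac{\partial F_4(p,P,t)}{\partial p} \qquad Q = \frac{\partial F_4(p,P,t)}{\partial P} \tag{15.87}$$

then the terms in brackets cancel, leading to the required transformation

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F_4(p,P,t)}{\partial t} \tag{15.88}$$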


Note that the last three generating functions require the inclusion of additional bilinear products of
q, p, Q, P in order for the terms to cancel to give the required result. The addition of the bilinear terms
ensures that the resultant generating function F is the same using any of the four generating functions
$F_1$, $F_2$, $F_3$, $F_4$. Frequently the $F_2(q,P,t)$ generating function is the most convenient. The four possible
generating functions of the first kind, given above, are related by Legendre transformations. A canonical
transformation does not have to conform to only one of the four generating functions $F_k$ for all the degrees
of freedom; it can be a mixture of different flavors for the different degrees of freedom. The properties of
the generating functions are summarized in table 15.1.


Table 15.1 Canonical transformation generating functions

Generating function                  Generating function derivatives              Trivial special examples
F = F_1(q,Q,t)                       p_i = ∂F_1/∂q_i     P_i = -∂F_1/∂Q_i         F_1 = q_i Q_i    Q_i = p_i     P_i = -q_i
F = F_2(q,P,t) - Q·P                 p_i = ∂F_2/∂q_i     Q_i = ∂F_2/∂P_i          F_2 = q_i P_i    Q_i = q_i     P_i = p_i
F = F_3(p,Q,t) + q·p                 q_i = -∂F_3/∂p_i    P_i = -∂F_3/∂Q_i         F_3 = p_i Q_i    Q_i = -q_i    P_i = -p_i
F = F_4(p,P,t) + q·p - Q·P           q_i = -∂F_4/∂p_i    Q_i = ∂F_4/∂P_i          F_4 = p_i P_i    Q_i = p_i     P_i = -q_i

The partial derivatives of the generating functions $F_i$ determine the corresponding conjugate variables
not explicitly included in the generating function $F_i$. Note that, for the first trivial example $F_1 = q_i Q_i$, the
old momenta become the new coordinates, $Q_i = p_i$, and vice versa, $P_i = -q_i$. This illustrates that it is
better to name them “conjugate variables” rather than “momenta” and “coordinates”.
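
The way a type 2 generating function fixes the remaining conjugate variables can be illustrated with a short numerical sketch. It assumes one degree of freedom and a time-independent $F_2(q,P)$, and evaluates $p = \partial F_2/\partial q$ and $Q = \partial F_2/\partial P$ by central differences. The function name `type2_transform` and the choice f(q) = q³ for the point transformation are illustrative assumptions, not taken from the text.

```python
def type2_transform(F2, q, P, h=1e-6):
    """Return (p, Q) from p = dF2/dq and Q = dF2/dP, by central differences."""
    p = (F2(q + h, P) - F2(q - h, P)) / (2 * h)
    Q = (F2(q, P + h) - F2(q, P - h)) / (2 * h)
    return p, Q


# Identity transformation F2 = q*P: expect p = P and Q = q.
p_id, Q_id = type2_transform(lambda q, P: q * P, 0.5, 2.0)

# Point transformation F2 = f(q)*P with the assumed choice f(q) = q**3:
# expect Q = q**3 and p = 3*q**2*P.
p_pt, Q_pt = type2_transform(lambda q, P: q ** 3 * P, 0.5, 2.0)
print(p_id, Q_id, p_pt, Q_pt)
```

The same pattern applies to the other three types once the signs listed in table 15.1 are used.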

In summary, Jacobi developed a mathematical framework for finding the generating function F
required to make a canonical transformation to a new Hamiltonian $\mathcal{H}(Q,P,t)$ that has a known solution.
That is,

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F}{\partial t} \tag{15.89}$$

When $\mathcal{H}(Q,P,t)$ is a constant, then a solution has been obtained. The inverse transformation for this solution
Q(t), P(t) → q(t), p(t) now can be used to express the final solution in terms of the original variables of the
system.

Note the special case when $\mathcal{H}(Q,P,t) = 0$, then equation 15.89 has been reduced to the Hamilton-Jacobi
relation (15.11)

$$H(q,p,t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$

In this case, the generating function F determines the action functional S required to solve the Hamilton-Jacobi
equation (15.11). Since equation (15.89) has transformed the Hamiltonian $H(q,p,t) \rightarrow \mathcal{H}(Q,P,t)$,
for which $\mathcal{H}(Q,P,t) = 0$, then the solution Q(t), P(t) for the Hamiltonian $\mathcal{H}(Q,P,t) = 0$ is obtained easily.
This approach underlies Hamilton-Jacobi theory presented in chapter 15.4.


15.3.2 Applications of canonical transformations 


The canonical transformation procedure may appear unnecessarily complicated for solving the examples 
given in this book, but it is essential for solving the complicated systems that occur in nature. For example, 
canonical transformations can be used to transform time-dependent, (non-autonomous) Hamiltonians to 
time-independent, (autonomous) Hamiltonians for which the solutions are known. Example 15.19 describes 
such a system. Canonical transformations provide a remarkably powerful approach for solving the equations 
of motion in Hamiltonian mechanics, especially when using the Hamilton-Jacobi approach discussed in 
chapter 15.4. 


15.7 Example: The identity canonical transformation 


The identity transformation $F_2(q,P) = q \cdot P$ satisfies (15.89) if the following relations are satisfied:
$p_i = \frac{\partial F_2}{\partial q_i} = P_i$, $Q_i = \frac{\partial F_2}{\partial P_i} = q_i$, $\mathcal{H} = H$. Note that the new and old coordinates are identical, hence $F_2 = q_i P_i$
generates the identity transformation $q_i = Q_i$, $p_i = P_i$.


15.8 Example: The point canonical transformation 


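Consider the point transformation $F_2(q,P) = f(q,t) \cdot P$ where f(q,t) is some function of q. This
transformation satisfies (15.89) if the following relations are satisfied: $Q_i = \frac{\partial F_2}{\partial P_i} = f_i(q,t)$, $p_i = \frac{\partial F_2}{\partial q_i} = \sum_j \frac{\partial f_j}{\partial q_i} P_j$,
and $\mathcal{H} = H$ when f has no explicit time dependence. Point transformations correspond to point-to-point transformations of coordinates.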


15.9 Example: The exchange canonical transformation 


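The exchange transformation $F_1(q,Q) = q \cdot Q$ satisfies (15.89) if the following relations are satisfied:
$p_i = \frac{\partial F_1}{\partial q_i} = Q_i$, $P_i = -\frac{\partial F_1}{\partial Q_i} = -q_i$, $\mathcal{H} = H$. That is, the coordinates and momenta have been interchanged.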


15.10 Example: Infinitesimal point canonical transformation


Consider an infinitesimal point canonical transformation, that is, one infinitesimally close to the identity transformation.
The generating function

$$F_2(q,P,t) = q \cdot P + \epsilon\, G(q,P,t)$$

satisfies (15.89) if the following relations are satisfied

$$Q_i = \frac{\partial F_2}{\partial P_i} = q_i + \epsilon \frac{\partial G(q,P,t)}{\partial P_i}$$

$$p_i = \frac{\partial F_2}{\partial q_i} = P_i + \epsilon \frac{\partial G(q,P,t)}{\partial q_i}$$

Thus the infinitesimal changes in $q_i$ and $p_i$ are given by

$$\delta q_i(q,p,t) = Q_i - q_i = \epsilon \frac{\partial G(q,P,t)}{\partial P_i} = \epsilon \frac{\partial G(q,p,t)}{\partial p_i} + O(\epsilon^2)$$

$$\delta p_i(q,p,t) = P_i - p_i = -\epsilon \frac{\partial G(q,P,t)}{\partial q_i} = -\epsilon \frac{\partial G(q,p,t)}{\partial q_i} + O(\epsilon^2)$$

Thus G(q,P,t) is the generator of the infinitesimal canonical transformation.


15.11 Example: 1-D harmonic oscillator via a canonical transformation 


The classic one-dimensional harmonic oscillator provides an example of the use of canonical transformations.
Consider the Hamiltonian where $\omega^2 = \frac{k}{m}$, then

$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left( p^2 + m^2\omega^2 q^2 \right)$$

This form of the Hamiltonian is a sum of two squares, suggesting a canonical transformation for which
H is cyclic in a new coordinate. A guess for a canonical transformation is of the form $p = m\omega q \cot Q$, which
is of the $F_1(q,Q)$ type where $F_1$ equals $F_1(q,Q) = \frac{m\omega q^2}{2} \cot Q$. Using (15.78) gives

$$p = \frac{\partial F_1(q,Q)}{\partial q} = m\omega q \cot Q$$

$$P = -\frac{\partial F_1(q,Q)}{\partial Q} = \frac{m\omega}{2} \frac{q^2}{\sin^2 Q}$$

Solving for the coordinates (p, q) yields

$$q = \sqrt{\frac{2P}{m\omega}} \sin Q \tag{a}$$

$$p = \sqrt{2m\omega P} \cos Q \tag{b}$$

Inserting these into H gives

$$\mathcal{H} = \omega P \left( \cos^2 Q + \sin^2 Q \right) = \omega P$$

which implies that Q is a cyclic coordinate.
The Hamiltonian is conservative, since it does not explicitly depend on time, and it equals the total energy
since the transformation to generalized coordinates is time independent. Thus

$$\mathcal{H} = E = \omega P$$

Since

$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} = \omega$$

then

$$Q = \omega t + \phi$$

Substituting Q into (a) gives the well known solution of the one-dimensional harmonic oscillator

$$q = \sqrt{\frac{2E}{m\omega^2}} \sin(\omega t + \phi)$$
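
The result that the harmonic-oscillator Hamiltonian reduces to ωP in the new variables can be checked with a few lines of Python. This is a minimal sketch under assumed parameter values; the names `old_variables` and `hamiltonian`, and the numbers chosen for `M`, `OMEGA`, `P0` and `phi`, are illustrative.

```python
import math

M, OMEGA = 2.0, 1.5  # assumed illustrative values


def old_variables(Q, P):
    """Equations (a) and (b): (q, p) in terms of the new variables (Q, P)."""
    q = math.sqrt(2 * P / (M * OMEGA)) * math.sin(Q)
    p = math.sqrt(2 * M * OMEGA * P) * math.cos(Q)
    return q, p


def hamiltonian(q, p):
    """H = (p^2 + m^2 w^2 q^2) / 2m in the original variables."""
    return (p ** 2 + (M * OMEGA * q) ** 2) / (2 * M)


P0, phi = 0.8, 0.3           # constant new momentum and phase
for t in (0.0, 0.4, 1.7):
    Q = OMEGA * t + phi      # Q advances linearly because dQ/dt = omega
    q, p = old_variables(Q, P0)
    print(t, q, hamiltonian(q, p), OMEGA * P0)
```

At every sampled time the third and fourth printed columns agree, reflecting that the energy depends only on the constant P.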


15.4 Hamilton-Jacobi theory


Hamilton used the Principle of Least Action to derive the Hamilton-Jacobi relation (chapter 15.3)

$$H(q,p,t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$

where q, p refer to the $1 \leq i \leq n$ variables $q_i$, $p_i$ and $S(q_j(t_1), t_1, q_j(t_2), t_2)$ is the action functional. Integration
of this first-order partial differential equation is nontrivial, which is a major handicap for practical
exploitation of the Hamilton-Jacobi equation. This stimulated Jacobi to develop the mathematical framework
for the canonical transformations that are required to solve the Hamilton-Jacobi equation. Jacobi’s approach
is to exploit generating functions for making a canonical transformation to a new Hamiltonian $\mathcal{H}(Q,P,t)$
that equals zero.

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial S}{\partial t} = 0 \tag{15.90}$$


The generating function for solving the Hamilton-Jacobi equation then equals the action functional S. 

The Hamilton-Jacobi theory is based on selecting a canonical transformation to new coordinates (Q, P, t)
all of which are either constant, or the $Q_i$ are cyclic, which implies that the corresponding momenta $P_i$ are
constants. In either case, a solution to the equations of motion is obtained. A remarkable feature of Hamilton-Jacobi
theory is that the canonical transformation is completely characterized by a single generating function,
S. The canonical equations likewise are characterized by a single Hamiltonian function, H. Moreover, the
generating function S, and Hamiltonian function H, are linked together by equation 15.11. The underlying
goal of Hamilton-Jacobi theory is to transform the Hamiltonian to a known form such that the canonical
equations become directly integrable. Since this transformation depends on a single scalar function, the
problem is reduced to solving a single partial differential equation.


15.4.1 Time-dependent Hamiltonian 
Jacobi’s complete integral $S(q_i, P_i, t)$


The principle underlying Jacobi’s approach to Hamilton-Jacobi theory is to provide a recipe for finding
the generating function F = S needed to transform the Hamiltonian H(q,p,t) to the new Hamiltonian
$\mathcal{H}(Q,P,t)$ using equation 15.90. When the derivatives of the transformed Hamiltonian $\mathcal{H}(Q,P,t)$ are zero,
then the equations of motion become

$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0 \tag{15.91}$$

$$\dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{15.92}$$

and thus $Q_i$ and $P_i$ are constants of motion. The new Hamiltonian $\mathcal{H}$ must be related to the original
Hamiltonian H by a canonical transformation for which

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial S}{\partial t} \tag{15.93}$$

Equations 15.91 and 15.92 are automatically satisfied if the new Hamiltonian $\mathcal{H} = 0$ since then equation
15.93 gives that the generating function S satisfies equation 15.90.

Any of the four types of generating function can be used. Jacobi chose the type 2 generating function
as being the most useful for many practical cases, that is, $S(q_i, P_i, t)$ which is called Jacobi’s complete
integral.

For generating functions $F_1$ and $F_2$ the generalized momenta are derived from the action by the derivative

$$p_i = \frac{\partial S}{\partial q_i} \tag{15.4}$$

Using this generalized momentum to replace $p_i$ in the Hamiltonian H, given in equation (15.93), leads to the
Hamilton-Jacobi equation expressed in terms of the action S.

$$H\left( q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t \right) + \frac{\partial S}{\partial t} = 0 \tag{15.94}$$

The Hamilton-Jacobi equation, (15.94), can be written more compactly using tensors q and ∇S to designate
$(q_1, \ldots, q_n)$ and $\left( \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n} \right)$ respectively. That is

$$H(q, \nabla S, t) + \frac{\partial S}{\partial t} = 0 \tag{15.95}$$

Equation (15.95) is a first-order partial differential equation in n + 1 variables which are the old spatial
coordinates $q_i$ plus time t. The new momenta $P_i$ have not been specified except that they are constants
since $\mathcal{H} = 0$.

Assume the existence of a solution of (15.95) of the form $S(q_i, P_i, t) = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_{n+1}; t)$ where
the generalized momenta $P_i = \alpha_1, \alpha_2, \ldots, \alpha_n$ plus t are the n + 1 independent constants of integration in the
transformed frame. One constant of integration is irrelevant to the solution since only partial derivatives of
$S(q_i, P_i, t)$ with respect to $q_i$ and t are involved. Thus, if S is a solution of the first-order partial differential
equation, then so is S + α, where α is a constant. Thus it can be assumed that one of the n + 1 constants of
integration is just an additive constant which can be ignored, leading effectively to a solution

$$S(q_i, P_i, t) = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_n; t) \tag{15.96}$$

where none of the n independent constants are solely additive. Such generating function solutions are called
complete solutions of the first-order partial differential equations since all constants of integration are known.

It is possible to assume that the n generalized momenta $P_i$ are constants $\alpha_i$. This allows the generalized
momentum to be written as

$$p_i = \frac{\partial S(q, \alpha, t)}{\partial q_i} \tag{15.97}$$

Similarly, Hamilton’s equations of motion give the conjugate coordinate Q = β, where the $\beta_i$ are constants. That
is

$$Q_i = \beta_i = \frac{\partial S(q, \alpha, t)}{\partial \alpha_i} \tag{15.98}$$

The above procedure has determined the complete set of 2n constants (Q = β, P = α). It is possible to
invert the canonical transformation to express the above solution, which is expressed in terms of $Q_i = \beta_i$
and $P_i = \alpha_i$, back to the original coordinates, that is, $q_i = q_i(\alpha, \beta, t)$ and momenta $p_i = p_i(\alpha, \beta, t)$ which is
the required solution.


Hamilton’s principle function $S_H(q_i, t; q_0, t_0)$


Hamilton’s approach to solving the Hamilton-Jacobi equation (15.95) is to seek a canonical transformation
from variables (p, q) at time t, to a new set of constant quantities, which may be the initial values $(q_0, p_0)$
at time $t = t_0$. Hamilton’s principle function $S_H(q_i, t; q_0, t_0)$ is the generating function for this canonical
transformation from the variables (q, p) at time t to the initial variables $(q_0, p_0)$ at time $t_0$. Hamilton’s
principle function $S_H(q_i, t; q_0, t_0)$ is directly related to Jacobi’s complete integral $S(q_i, P_i, t)$.

Note that $S_H$ is the generating function of a canonical transformation from the present time (q, p, t)
variables to the initial $(q_0, p_0, t_0)$, whereas Jacobi’s S is the generating function of a canonical transformation
from the present (q, p, t) variables to the constant variables (Q = β, P = α). For the Hamilton approach,
the canonical transformation can be accomplished in two steps using S by first transforming from (q, p, t)
at time t, to (β, α), then transforming from (β, α) to $(q_0, p_0, t_0)$. That is, this two-step process corresponds
to

$$S_H(q, t; q_0, t_0) = S(q, \alpha, t) - S(q_0, \alpha, t_0) \tag{15.99}$$

Hamilton’s principle function $S_H(q, t; q_0, t_0)$ is related to Jacobi’s complete integral S(q, α, t), and it will not
be discussed further in this book.


15.4.2 Time-independent Hamiltonian


Frequently the Hamiltonian does not explicitly depend on time. For the standard Lagrangian with time-independent
constraints and transformation, then H(q,p,t) = E which is the total energy. For this case,
the Hamilton-Jacobi equation simplifies to give

$$\frac{\partial S}{\partial t} = -H(q,p,t) = -E(\alpha) \tag{15.100}$$

The integration of the time dependence is trivial, and thus the action integral for a time-independent Hamiltonian
equals

$$S(q, \alpha, t) = W(q, \alpha) - E(\alpha)t \tag{15.101}$$

That is, the action integral has separated into a time independent term W(q, α), which is called Hamilton’s
characteristic function, plus a time-dependent term −E(α)t. Thus using equations 15.97, 15.101 gives
that the generalized momentum is

$$p_i = \frac{\partial W(q, \alpha)}{\partial q_i} \tag{15.102}$$

The physical significance of Hamilton’s characteristic function W(q, α) can be understood by taking the
total time derivative

$$\frac{dW}{dt} = \sum_i \frac{\partial W(q, \alpha)}{\partial q_i} \dot{q}_i = \sum_i p_i \dot{q}_i$$

Taking the time integral then gives

$$W(q, \alpha) = \int \sum_i p_i \dot{q}_i \, dt = \int \sum_i p_i \, dq_i \tag{15.103}$$

Note that this equals the abbreviated action described in chapter 9.2.3, that is $W(q, \alpha) = S_0(q, \alpha)$.
Inserting the action S(q, α) into the Hamilton-Jacobi equation (15.12) gives

$$H\left( q; \frac{\partial W(q, \alpha)}{\partial q} \right) = E(\alpha) \tag{15.104}$$

This is called the time-independent Hamilton-Jacobi equation. Usually it is convenient to have E
equal the total energy. However, sometimes it is more convenient to exclude the k-th energy $E(\alpha_k)$ in the
set, in which case $E = E(\alpha_1, \alpha_2, \ldots, \alpha_{k-1})$; the Routhian exploits this feature.

The equations of the canonical transformation expressed in terms of W(q, α) are

$$p_i = \frac{\partial W(q, \alpha)}{\partial q_i} \qquad \beta_i + \frac{\partial E(\alpha)}{\partial \alpha_i} t = \frac{\partial W(q, \alpha)}{\partial \alpha_i} \tag{15.105}$$

These equations show that Hamilton’s characteristic function W(q, α) is itself the generating function of a
time-independent canonical transformation from the old variables (q, p) to a set of new variables

$$Q_i = \beta_i + \frac{\partial E(\alpha)}{\partial \alpha_i} t \qquad P_i = \alpha_i \tag{15.106}$$

Table 15.2 summarizes the time-dependent and time-independent forms of the Hamilton-Jacobi equation.


Table 15.2 Hamilton-Jacobi formulations

Hamiltonian                        Time dependent H(q, p, t)                              Time independent H(q, p)
Transformed Hamiltonian            ℋ = 0                                                  ℋ is cyclic
Canonical transformed variables    All Q_i, P_i are constants of motion                   All P_i are constants of motion
Transformed equations of motion    dQ_i/dt = ∂ℋ/∂P_i = 0, therefore Q_i = β_i             dQ_i/dt = ∂ℋ/∂P_i = v_i, therefore Q_i = v_i t + β_i
                                   dP_i/dt = -∂ℋ/∂Q_i = 0, therefore P_i = α_i            dP_i/dt = -∂ℋ/∂Q_i = 0, therefore P_i = α_i
Generating function                Jacobi’s complete integral S(q, P, t)                  Characteristic function W(q, P)
Hamilton-Jacobi equation           H(q_1..q_n; ∂S/∂q_1..∂S/∂q_n; t) + ∂S/∂t = 0           H(q_1..q_n; ∂W/∂q_1..∂W/∂q_n) = E
Transformation equations           p_i = ∂S/∂q_i                                          p_i = ∂W/∂q_i
                                   Q_i = ∂S/∂α_i = β_i                                    Q_i = ∂W/∂α_i = v_i t + β_i




15.4.3 Separation of variables 


Exploitation of the Hamilton-Jacobi theory requires finding a suitable action function S. When the Hamiltonian
is time independent, then equation 15.101 shows that the time dependence of the action integral
separates out from the dependence on the spatial variables. For many systems, Hamilton’s characteristic
function W(q, P) separates into a simple sum of terms each of which is a function of a single variable. That
is,

$$W(q, \alpha) = W_1(q_1) + W_2(q_2) + \cdots + W_n(q_n) \tag{15.107}$$

where each function in the summation on the right depends only on a single variable. Then equation (15.100)
reduces to

$$H\left( q_1, \ldots, q_n; \frac{\partial W}{\partial q_1}, \ldots, \frac{\partial W}{\partial q_n} \right) = E \tag{15.108}$$

where E is the constant denoting the total energy.
Hamilton’s characteristic function W(q, P) can be used with equations (15.101), (15.102), (15.91),
(15.92), and (15.93) to derive

$$p_i = \frac{\partial W(q, \alpha)}{\partial q_i} \qquad Q_i = \frac{\partial W(q, \alpha)}{\partial P_i} \tag{15.109}$$

$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0 \qquad \dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{15.110}$$

$$\mathcal{H} = H + \frac{\partial S}{\partial t} = H - E = 0 \tag{15.111}$$

which has reduced the problem to a simple sum of one-dimensional first-order differential equations.

If the i-th variable is cyclic, then the Hamiltonian is not a function of $q_i$ and the i-th term in Hamilton’s
characteristic function equals $W_i = \alpha_i q_i$, which separates out from the summation in equation 15.107. That
is, all cyclic variables can be factored out of W(q, α), which greatly simplifies solution of the Hamilton-Jacobi
equation. As a consequence, the ability of the Hamilton-Jacobi method to make a canonical transformation that
separates the system into many cyclic or independent variables, which can be solved trivially, is a remarkably
powerful way for solving the equations of motion in Hamiltonian mechanics.


15.12 Example: Free particle 

Consider the motion of a free particle of mass m in a force-free region. Then equation 15.93 reduces to

$$H\left( q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t \right) + \frac{\partial S}{\partial t} = 0$$

Since no forces act, and the momentum p = ∇S, thus the Hamilton-Jacobi equation reduces to

$$\frac{1}{2m} \left( \nabla S \right)^2 + \frac{\partial S}{\partial t} = 0 \tag{A}$$

The Hamiltonian is time independent, thus equation 15.101 applies

$$S(q, t) = W(q, \alpha) - E(\alpha)t$$

Since the Hamiltonian does not explicitly depend on the coordinates (x, y, z), then the coordinates are cyclic
and separation of the variables, 15.107, gives that the action

$$S = \alpha \cdot r - Et \tag{B}$$

For equation B to be a solution of equation A requires that

$$E = \frac{1}{2m} \alpha^2 \tag{C}$$

Therefore

$$S = \alpha \cdot r - \frac{1}{2m} \alpha^2 t \tag{D}$$

Since

$$Q = \frac{\partial S}{\partial \alpha} = r - \frac{\alpha}{m} t$$

the equation of motion and the conjugate momentum are given by

$$r = Q + \frac{\alpha}{m} t \qquad p = \nabla S = \alpha$$

Thus the Hamilton-Jacobi relation has given both the equation of motion and the linear momentum p.
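
As a quick numerical sanity check on the free-particle result, the sketch below evaluates the left-hand side of the free-particle Hamilton-Jacobi equation, $(\nabla S)^2/2m + \partial S/\partial t$, for the action $S = \alpha \cdot r - \alpha^2 t / 2m$ using central differences. The values of `M` and `ALPHA`, the sample point, and the function names are illustrative assumptions.

```python
M = 1.5                      # assumed illustrative mass
ALPHA = (0.3, -1.1, 0.7)     # assumed constant momentum components


def action(r, t):
    """S = alpha . r - (alpha^2 / 2m) t."""
    alpha_sq = sum(a * a for a in ALPHA)
    return sum(a * x for a, x in zip(ALPHA, r)) - alpha_sq * t / (2 * M)


def hj_residual(r, t, h=1e-5):
    """Finite-difference value of (grad S)^2 / 2m + dS/dt, which should vanish."""
    grad_sq = 0.0
    for k in range(3):
        plus = list(r)
        minus = list(r)
        plus[k] += h
        minus[k] -= h
        grad_sq += ((action(plus, t) - action(minus, t)) / (2 * h)) ** 2
    dS_dt = (action(r, t + h) - action(r, t - h)) / (2 * h)
    return grad_sq / (2 * M) + dS_dt


residual = hj_residual((0.2, 1.0, -0.5), 0.9)
print(residual)
```

The residual vanishes to within rounding error, independent of the sample point, because S is linear in both r and t.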


15.13 Example: Point particle in a uniform gravitational field 


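The Hamiltonian is

$$H = \frac{1}{2m} \left( p_x^2 + p_y^2 + p_z^2 \right) + mgz$$

Since the system is conservative, then the Hamilton-Jacobi equation can be written in terms of Hamilton’s
characteristic function W

$$E = \frac{1}{2m} \left[ \left( \frac{\partial W}{\partial x} \right)^2 + \left( \frac{\partial W}{\partial y} \right)^2 + \left( \frac{\partial W}{\partial z} \right)^2 \right] + mgz$$

Assuming that the variables can be separated W = X(x) + Y(y) + Z(z) leads to

$$p_x = \frac{\partial X(x)}{\partial x} = \alpha_x$$

$$p_y = \frac{\partial Y(y)}{\partial y} = \alpha_y$$

$$p_z = \frac{\partial Z(z)}{\partial z} = \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}$$

Thus by integration the total W equals

$$W = \int_{x_0}^{x} \alpha_x \, dx + \int_{y_0}^{y} \alpha_y \, dy + \int_{z_0}^{z} \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2} \, dz$$

Therefore using (15.106) gives

$$\beta_z = t - t_0 = \int_{z_0}^{z} \frac{m \, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

$$\beta_x = \text{constant} = (x - x_0) - \int_{z_0}^{z} \frac{\alpha_x \, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

$$\beta_y = \text{constant} = (y - y_0) - \int_{z_0}^{z} \frac{\alpha_y \, dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

If $x_0, y_0, z_0$ is the position of the particle at time $t = t_0$ then $\beta_x = \beta_y = 0$, and from (15.106)

$$x - x_0 = \left( \frac{\alpha_x}{m} \right) (t - t_0)$$

$$y - y_0 = \left( \frac{\alpha_y}{m} \right) (t - t_0)$$

$$z - z_0 = \frac{\sqrt{2m(E - mgz_0) - \alpha_x^2 - \alpha_y^2}}{m} (t - t_0) - \frac{1}{2} g (t - t_0)^2$$

This corresponds to a parabola as should be expected for this trivial example.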
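

15.14 Example: One-dimensional harmonic oscillator


As discussed in example 15.11 the Hamiltonian for the one-dimensional harmonic oscillator can be written
as

$$H = \frac{1}{2m} \left( p^2 + m^2\omega^2 q^2 \right) = E$$

assuming it is conservative and where $\omega = \sqrt{\frac{k}{m}}$.

Hamilton’s characteristic function W can be used where

$$S(q, E, t) = W(q, E) - Et \qquad p = \frac{\partial W}{\partial q}$$

Inserting the generalized momentum p into the Hamiltonian gives

$$\frac{1}{2m} \left[ \left( \frac{\partial W}{\partial q} \right)^2 + m^2\omega^2 q^2 \right] = E$$

Integration of this equation gives

$$W = \int \sqrt{2mE - m^2\omega^2 q^2} \, dq$$

That is

$$S = \int \sqrt{2mE - m^2\omega^2 q^2} \, dq - Et$$

Note that

$$\frac{\partial S}{\partial E} = \int \frac{m \, dq}{\sqrt{2mE - m^2\omega^2 q^2}} - t$$

is a constant, which is written as $-t_0$. This can be integrated to give

$$t = \frac{1}{\omega} \arcsin\left( q \sqrt{\frac{m\omega^2}{2E}} \right) + t_0$$

That is

$$q = \sqrt{\frac{2E}{m\omega^2}} \sin \omega (t - t_0)$$

This is the familiar solution of the undamped harmonic oscillator.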
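
The time integral in this example can be checked numerically. The following sketch compares a midpoint-rule estimate of $\int_0^q m\,dq'/\sqrt{2mE - m^2\omega^2 q'^2}$ with the closed form $(1/\omega)\arcsin(q\sqrt{m\omega^2/2E})$. The parameter values and the function names `elapsed_time_numeric` and `elapsed_time_closed` are illustrative assumptions, and the test point is kept away from the turning point where the integrand diverges.

```python
import math

M, OMEGA, E = 1.0, 2.0, 3.0   # assumed illustrative values


def elapsed_time_numeric(q, n=20000):
    """Midpoint-rule estimate of the integral of m dq' / sqrt(2mE - m^2 w^2 q'^2) from 0 to q."""
    dq = q / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * dq
        total += M / math.sqrt(2 * M * E - (M * OMEGA * x) ** 2)
    return total * dq


def elapsed_time_closed(q):
    """Closed form (1/w) arcsin(q sqrt(m w^2 / 2E))."""
    return math.asin(q * math.sqrt(M * OMEGA ** 2 / (2 * E))) / OMEGA


amplitude = math.sqrt(2 * E / (M * OMEGA ** 2))
q_test = 0.6 * amplitude      # stay away from the turning point
print(elapsed_time_numeric(q_test), elapsed_time_closed(q_test))
```

Inverting the closed form reproduces q = amplitude × sin(ω(t − t0)) with t0 = 0 for this choice of lower limit.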


15.15 Example: The central force problem 


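The problem of a particle acted upon by a central force occurs frequently in physics. Consider the mass m
acted upon by a time-independent central potential energy U(r). The Hamiltonian is time independent and
can be written in spherical coordinates as

$$H = \frac{1}{2m} \left( p_r^2 + \frac{1}{r^2} p_\theta^2 + \frac{1}{r^2 \sin^2\theta} p_\phi^2 \right) + U(r) = E$$

The system is conservative, thus the time-independent Hamilton-Jacobi equation is

$$\frac{1}{2m} \left[ \left( \frac{\partial W}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial W}{\partial \theta} \right)^2 + \frac{1}{r^2 \sin^2\theta} \left( \frac{\partial W}{\partial \phi} \right)^2 \right] + U(r) = E$$

Try a separable solution for Hamilton’s characteristic function W of the form

$$W = R(r) + \Theta(\theta) + \Phi(\phi)$$

The Hamilton-Jacobi equation then becomes

$$\frac{1}{2m} \left[ \left( \frac{\partial R}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial \Theta}{\partial \theta} \right)^2 + \frac{1}{r^2 \sin^2\theta} \left( \frac{\partial \Phi}{\partial \phi} \right)^2 \right] + U(r) = E$$

This can be rearranged into the form

$$2mr^2 \sin^2\theta \left\{ \frac{1}{2m} \left[ \left( \frac{\partial R}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial \Theta}{\partial \theta} \right)^2 \right] + U(r) - E \right\} = -\left( \frac{\partial \Phi}{\partial \phi} \right)^2$$

The left-hand side is independent of φ whereas the right-hand side is independent of r and θ. Both sides
must equal a constant which is set to equal $-L_z^2$, that is

$$\frac{1}{2m} \left[ \left( \frac{\partial R}{\partial r} \right)^2 + \frac{1}{r^2} \left( \frac{\partial \Theta}{\partial \theta} \right)^2 + \frac{L_z^2}{r^2 \sin^2\theta} \right] + U(r) = E$$

$$\left( \frac{\partial \Phi}{\partial \phi} \right)^2 = L_z^2$$

The equation in r and θ can be rearranged in the form

$$2mr^2 \left[ \frac{1}{2m} \left( \frac{\partial R}{\partial r} \right)^2 + U(r) - E \right] = -\left[ \left( \frac{\partial \Theta}{\partial \theta} \right)^2 + \frac{L_z^2}{\sin^2\theta} \right]$$

The left-hand side is independent of θ and the right-hand side is independent of r so both must equal a
constant which is set to be $-L^2$

$$\frac{1}{2m} \left( \frac{\partial R}{\partial r} \right)^2 + U(r) + \frac{L^2}{2mr^2} = E$$

$$\left( \frac{\partial \Theta}{\partial \theta} \right)^2 + \frac{L_z^2}{\sin^2\theta} = L^2$$

The variables now are completely separated and, by rearrangement plus integration, one obtains

$$R(r) = \sqrt{2m} \int \sqrt{E - U(r) - \frac{L^2}{2mr^2}} \, dr$$

$$\Theta(\theta) = \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}} \, d\theta$$

$$\Phi(\phi) = L_z \phi$$

Substituting these into $W = R(r) + \Theta(\theta) + \Phi(\phi)$ gives

$$W = \sqrt{2m} \int \sqrt{E - U(r) - \frac{L^2}{2mr^2}} \, dr + \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}} \, d\theta + L_z \phi$$

Hamilton’s characteristic function W is the generating function from coordinates $(r, \theta, \phi, p_r, p_\theta, p_\phi)$ to new
coordinates, which are cyclic, and new momenta that are constant and taken to be the separation constants
$E$, $L$, $L_z$. Thus

$$p_r = \frac{\partial W}{\partial r} = \sqrt{2m} \sqrt{E - U(r) - \frac{L^2}{2mr^2}}$$

$$p_\theta = \frac{\partial W}{\partial \theta} = \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}$$

$$p_\phi = \frac{\partial W}{\partial \phi} = L_z$$

Similarly, using (15.109) gives the new coordinates $\beta_E$, $\beta_L$, $\beta_{L_z}$

$$\beta_E + t = \frac{\partial W}{\partial E} = \sqrt{\frac{m}{2}} \int \frac{dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}}$$

$$\beta_L = \frac{\partial W}{\partial L} = \sqrt{2m} \int \frac{\left( -\frac{L}{2mr^2} \right) dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}} + \int \frac{L \, d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}}$$

$$\beta_{L_z} = \frac{\partial W}{\partial L_z} = \int \frac{\left( -\frac{L_z}{\sin^2\theta} \right) d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}} + \phi$$

These equations lead to the elliptical, parabolic, or hyperbolic orbits discussed in chapter 11.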


15.16 Example: Linearly-damped, one-dimensional, harmonic oscillator 


A canonical treatment of the linearly-damped harmonic oscillator provides an example that combines use
of non-standard Lagrangians and Hamiltonians, a canonical transformation to an autonomous system, and
use of Hamilton-Jacobi theory to solve this transformed system. It shows that Hamilton-Jacobi theory can be
used to determine directly the solutions for the linearly-damped harmonic oscillator.

Non-standard Hamiltonian:

In chapter 3.5, the equation of motion for the linearly-damped, one-dimensional, harmonic oscillator was
given to be

$$\frac{m}{2} \left[ \ddot{q} + \Gamma \dot{q} + \omega_0^2 q \right] = 0 \tag{a}$$

Example 10.3 showed that three non-standard Lagrangians give equation of motion (a) when used with the
standard Euler-Lagrange variational equations. One of these was the Bateman[Bat31] time-dependent Lagrangian

$$L_2(q, \dot{q}, t) = \frac{m}{2} e^{\Gamma t} \left[ \dot{q}^2 - \omega_0^2 q^2 \right] \tag{b}$$

This Lagrangian gave the generalized momentum to be

$$p = \frac{\partial L_2}{\partial \dot{q}} = m \dot{q} e^{\Gamma t} \tag{c}$$

and the corresponding Hamiltonian

$$H_2(q, p, t) = p\dot{q} - L_2(q, \dot{q}, t) = \frac{p^2}{2m} e^{-\Gamma t} + \frac{1}{2} m \omega_0^2 q^2 e^{\Gamma t} \tag{d}$$

Note that both the Lagrangian and Hamiltonian are explicitly time dependent and thus they are not
conserved quantities. This is as expected for this dissipative system.

Hamilton-Jacobi theory:

The form of the non-autonomous Hamiltonian (d) suggests use of the generating function for a canonical
transformation to an autonomous Hamiltonian, for which $\mathcal{H}$ is a constant of motion.

$$S(q, P, t) = F_2(q, P, t) = qP e^{\Gamma t / 2} = QP \tag{d}$$

Then the canonical transformation gives

$$p = \frac{\partial S}{\partial q} = P e^{\Gamma t / 2} \tag{e}$$

$$Q = \frac{\partial S}{\partial P} = q e^{\Gamma t / 2}$$

Inserting this canonical transformation into the above Hamiltonian leads to the transformed Hamiltonian that
is autonomous.

$$\mathcal{H}(Q, P, t) = H_2(q, p, t) + \frac{\partial F_2}{\partial t} = \frac{P^2}{2m} + \frac{\Gamma}{2} QP + \frac{m\omega_0^2}{2} Q^2 \tag{f}$$

That is, the transformed Hamiltonian $\mathcal{H}(Q,P,t)$ is not explicitly time dependent, and thus is conserved.
Expressed in the original canonical variables (q, p), the transformed Hamiltonian

$$\mathcal{H}(Q, P, t) = \frac{p^2}{2m} e^{-\Gamma t} + \frac{\Gamma}{2} qp + \frac{m\omega_0^2}{2} q^2 e^{\Gamma t}$$

is a constant of motion which was not readily apparent when using the original Hamiltonian. This unexpected
result illustrates the usefulness of canonical transformations for solving dissipative systems. The Hamilton-Jacobi
theory now can be used to solve the equations of motion for the transformed variables (Q, P) plus the
transformed Hamiltonian $\mathcal{H}(Q,P,t)$. The derivative of the generating function gives

$$P = \frac{\partial S}{\partial Q} \tag{g}$$

Use equation (g) to substitute for P in the Hamiltonian $\mathcal{H}(Q,P,t)$ (equation (f)), then the Hamilton-Jacobi
method gives

$$\frac{1}{2m} \left( \frac{\partial S}{\partial Q} \right)^2 + \frac{\Gamma}{2} Q \frac{\partial S}{\partial Q} + \frac{m\omega_0^2}{2} Q^2 + \frac{\partial S}{\partial t} = 0$$

This equation is separable as described in 15.107 and thus let

$$S(Q, \alpha, t) = W(Q, \alpha) - \alpha t$$

where α is a separation constant. Then

$$\frac{1}{2m} \left( \frac{\partial W}{\partial Q} \right)^2 + \frac{\Gamma}{2} Q \frac{\partial W}{\partial Q} + \frac{m\omega_0^2}{2} Q^2 = \alpha \tag{h}$$

To simplify the equations define the variable x as

$$x = \sqrt{m\omega_0} \, Q \tag{i}$$

then equation (h) can be written as

$$\left( \frac{\partial W}{\partial x} \right)^2 + A x \frac{\partial W}{\partial x} + \left( x^2 - B \right) = 0 \tag{j}$$

where $A = \frac{\Gamma}{\omega_0}$ and $B = \frac{2\alpha}{\omega_0}$. Assume initial conditions $q(0) = q_0$ and $\dot{q}(0) = 0$.
For this case the separation constant α > 0, therefore B > 0. Note that equation (j) is a simple
quadratic algebraic relation in $\partial W / \partial x$, the solution of which is

$$\frac{\partial W}{\partial x} = -\frac{Ax}{2} \pm \sqrt{B - \left[ 1 - \left( \frac{A}{2} \right)^2 \right] x^2} \tag{k}$$

The choice of the sign is irrelevant for this case and thus the positive sign is chosen. There are three possible
cases for the solution depending on whether the square-root term is real, zero, or imaginary.
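Case 1: $\frac{A}{2} < 1$, that is, $\frac{\Gamma}{2\omega_0} < 1$

Define $C = \sqrt{1 - \left( \frac{A}{2} \right)^2}$. Then equation (k) can be integrated to give

$$S = -\alpha t - \frac{Ax^2}{4} + \int \sqrt{B - C^2 x^2} \, dx \tag{l}$$

and

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0} \int \frac{dx}{\sqrt{B - C^2 x^2}}$$

This integral gives

$$\sin^{-1} \left( \frac{Cx}{\sqrt{B}} \right) = C\omega_0 (t + \beta) = \omega t + \delta$$

where

$$\omega = \omega_0 C = \omega_0 \sqrt{1 - \left( \frac{\Gamma}{2\omega_0} \right)^2} \tag{m}$$

Transforming back to the original variable q gives

$$q(t) = G e^{-\Gamma t / 2} \sin(\omega t + \delta) \tag{n}$$

where G and δ are given by the initial conditions. Equation (n) is identical to the solution for the underdamped
linearly-damped linear oscillator given previously in equation 3.35.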
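Case 2: $\frac{A}{2} = 1$, that is, $\frac{\Gamma}{2\omega_0} = 1$

In this case $C = \sqrt{1 - \left( \frac{A}{2} \right)^2} = 0$ and thus equation (l) simplifies to

$$S = -\alpha t - \frac{Ax^2}{4} + x\sqrt{B}$$

and

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{x}{\omega_0 \sqrt{B}}$$

Therefore the solution is

$$q(t) = e^{-\Gamma t / 2} (F + Gt) \tag{o}$$

where F and G are constants given by the initial conditions. This is the solution for the critically-damped
linearly-damped, linear oscillator given previously in equation 3.38.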
Case 3: $\frac{A}{2} > 1$, that is, $\frac{\Gamma}{2\omega_0} > 1$

Define a real constant D where $D = \sqrt{\left( \frac{A}{2} \right)^2 - 1}$, so that $C = iD$, then

$$S = -\alpha t - \frac{Ax^2}{4} + \int \sqrt{B + D^2 x^2} \, dx$$

Then

$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0} \int \frac{dx}{\sqrt{B + D^2 x^2}}$$

This last integral gives

$$\sinh^{-1} \left( \frac{Dx}{\sqrt{B}} \right) = D\omega_0 (t + \beta) = \omega t + \delta$$

where

$$\omega = \omega_0 D = \omega_0 \sqrt{\left( \frac{\Gamma}{2\omega_0} \right)^2 - 1}$$

Then the original variable gives

$$q(t) = G e^{-\Gamma t / 2} \sinh(\omega t + \delta)$$

This is the classic solution of the overdamped linearly-damped, linear harmonic oscillator given previously in
equation 3.37. The canonical transformation from a non-autonomous to an autonomous system allowed use
of Hamiltonian mechanics to solve the damped oscillator problem.

Note that this example used Bateman’s non-standard Lagrangian, and corresponding Hamiltonian, for
handling a dissipative linear oscillator system where the dissipation depends linearly on velocity. This non-standard
Lagrangian led to the correct equations of motion and solutions when applied using either the
time-dependent Lagrangian, or time-dependent Hamiltonian, and these solutions agree with those given in
chapter 3.5 which were derived using Newtonian mechanics.
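
The claim that the transformed Hamiltonian is a constant of motion for the damped oscillator can be tested numerically. The sketch below integrates $\ddot{q} + \Gamma\dot{q} + \omega_0^2 q = 0$ with a classical Runge-Kutta step and monitors $\frac{p^2}{2m}e^{-\Gamma t} + \frac{\Gamma}{2}qp + \frac{m\omega_0^2}{2}q^2 e^{\Gamma t}$ with $p = m\dot{q}e^{\Gamma t}$. The parameter values, step size, and function names are illustrative assumptions, not part of the example.

```python
import math

M, GAMMA, OMEGA0 = 1.0, 0.4, 2.0   # assumed illustrative parameters


def accel(q, v):
    # Equation of motion: q'' + Gamma q' + w0^2 q = 0
    return -GAMMA * v - OMEGA0 ** 2 * q


def rk4_step(q, v, dt):
    """One classical Runge-Kutta step for (q, v = dq/dt)."""
    k1q, k1v = v, accel(q, v)
    k2q, k2v = v + 0.5 * dt * k1v, accel(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
    k3q, k3v = v + 0.5 * dt * k2v, accel(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
    k4q, k4v = v + dt * k3v, accel(q + dt * k3q, v + dt * k3v)
    q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return q, v


def invariant(q, v, t):
    """Transformed Hamiltonian written in the original variables (q, p)."""
    p = M * v * math.exp(GAMMA * t)          # p = m (dq/dt) exp(Gamma t)
    return (p ** 2 * math.exp(-GAMMA * t) / (2 * M)
            + 0.5 * GAMMA * q * p
            + 0.5 * M * OMEGA0 ** 2 * q ** 2 * math.exp(GAMMA * t))


q, v, t, dt = 1.0, 0.0, 0.0, 1e-3
values = [invariant(q, v, t)]
for _ in range(5000):
    q, v = rk4_step(q, v, dt)
    t += dt
    values.append(invariant(q, v, t))
drift = max(values) - min(values)
print(values[0], drift)
```

The drift stays at the level of the integrator error even though the mechanical energy of the oscillator decays over the same interval.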
15.4.4 Visual representation of the action function S


The important role of the action integral S can be illuminated by considering the case of a single point mass m moving in a time-independent potential U(r). Then the action reduces to

    S(q, \alpha, t) = W(q, \alpha) - Et    (15.112)

Let q_1 = x, q_2 = y, q_3 = z, p_1 = p_x, p_2 = p_y, p_3 = p_z. The momentum components are given by

    p_i = \frac{\partial W(q, \alpha)}{\partial q_i}    (15.113)

which corresponds to

    \mathbf{p} = \nabla W = \nabla S    (15.114)

That is, the time-independent Hamilton-Jacobi equation is

    \frac{1}{2m} |\nabla W|^2 + U(r) = E    (15.115)

Figure 15.2: Surfaces of constant action integral S (dashed lines) and the corresponding particle momenta (solid lines) with arrows showing the direction.

This implies that the particle momentum is given by the gradient of Hamilton's characteristic function and is perpendicular to surfaces of constant W as illustrated in figure 15.2. The surfaces of constant S are time dependent as given by equation (15.101). Thus, if at time t = 0 the equi-action surface S_0(q, t) = W_0(q, \alpha) = 0, then at t = 1 the same surface S_0(q, t) = 0 now coincides with the W_0(q, \alpha) = E surface, etc. That is, the equi-action surfaces move through space separately from the motion of the single point mass.
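
The separate motion of the equi-action surfaces and the particle can be made concrete with a minimal sketch. It assumes the simplest possible case, a free particle in one dimension (U = 0), for which W(x) = sqrt(2mE) x; the function names and the numerical values are illustrative choices, not taken from the text.

```python
import math


def action_surface_position(s_value, energy, mass, t):
    """Position x of the surface S(x, t) = s_value for a free particle in 1D.

    Assumption: U = 0, so W(x) = sqrt(2 m E) x and S = sqrt(2 m E) x - E t.
    """
    momentum = math.sqrt(2.0 * mass * energy)
    return (s_value + energy * t) / momentum


def particle_position(energy, mass, t, x0=0.0):
    """Position of the particle itself, moving with velocity p / m."""
    velocity = math.sqrt(2.0 * energy / mass)
    return x0 + velocity * t


mass, energy = 1.0, 2.0  # illustrative values in arbitrary units
surface_speed = (action_surface_position(0.0, energy, mass, 1.0)
                 - action_surface_position(0.0, energy, mass, 0.0))
particle_speed = (particle_position(energy, mass, 1.0)
                  - particle_position(energy, mass, 0.0))
print(surface_speed, particle_speed)
```

In this special case the S = 0 surface advances at speed E/p, which is half the particle speed p/m, so the surface and the particle do not travel together.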

The above pictorial representation is analogous to the situation for motion of a wavefront for electromagnetic waves in optics, or matter waves in quantum physics, where the wave equation separates into the form \phi = \phi_0 e^{iS/\hbar} = \phi_0 e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}. Hamilton's goal was to create a unified theory for optics that was equally applicable to particle motion in classical mechanics. Thus the optical-mechanical analogy of the Hamilton-Jacobi theory has culminated in a universal theory that describes wave-particle duality; this was a Holy Grail of classical mechanics since Newton's time. It played an important role in development of the Schrödinger representation of quantum mechanics.


15.4.5 Advantages of Hamilton-Jacobi theory 


Initially, only a few scientists, like Jacobi, recognized the advantages of Hamiltonian mechanics. In 1843 
Jacobi made some brilliant mathematical developments in Hamilton-Jacobi theory that greatly enhanced 
exploitation of Hamiltonian mechanics. Hamilton-Jacobi theory now serves as a foundation for contemporary 
physics, such as quantum and statistical mechanics. A major advantage of Hamilton-Jacobi theory, compared 
to other formulations of analytic mechanics, is that it provides a single, first-order partial differential equation 
for the action S, which is a function of the n generalized coordinates q and time t. The generalized momenta 
no longer appear explicitly in the Hamiltonian in equations 15.94, 15.95. Note that the generalized momenta
do not explicitly appear in the equivalent Euler-Lagrange equations of Lagrangian mechanics, but these
comprise a system of n second-order ordinary differential equations for the time evolution of the generalized
coordinates q. Hamilton's equations of motion are a system of 2n first-order equations for the time evolution
of the generalized coordinates and their conjugate momenta. 

An important advantage of the Hamilton-Jacobi theory is that it provides a formulation of classical 
mechanics in which motion of a particle can be represented by a wave. In this sense, the Hamilton-Jacobi 
equation fulfilled a long-held goal of theoretical physics, that dates back to Johann Bernoulli, of finding an 
analogy between the propagation of light and the motion of a particle. This goal motivated Hamilton to 
develop Hamiltonian mechanics. A consequence of this wave-particle analogy is that the Hamilton-Jacobi 
formalism featured prominently in the derivation of the Schrödinger equation during the development of 
quantum-wave mechanics. 
15.5 Action-angle variables


15.5.1 Canonical transformation 


Systems possessing periodic solutions are a ubiquitous feature in physics. The periodic motion can be either 
an oscillation, for which the trajectory in phase space is a closed loop (libration), or rolling (rotational) 
motion as discussed in chapter 3.4.4. For many problems involving periodic motion, the interest often lies in 
the frequencies of motion rather than the detailed shape of the trajectories in phase space. The action-angle 
variable approach uses a canonical transformation to action and angle variables, which provides a powerful and
elegant method to exploit Hamiltonian mechanics. In particular, it can determine the frequencies of periodic
motion without having to calculate the exact trajectories for the motion. This method was introduced by
the French astronomer Ch. E. Delaunay (1816-1872) for applications to orbits in celestial mechanics, but
it has equally important applications beyond celestial mechanics, such as to bound solutions of the atom in
quantum mechanics.

The action-angle method replaces the momenta in the Hamilton-Jacobi procedure by the action phase integral for the closed loop (libration) trajectory in phase space defined by

    J_i = \oint p_i \, dq_i    (15.116)

where for each cyclic variable the integral is taken over one complete period of oscillation. The cyclic variable I_i is called the action variable where

    I_i = \frac{1}{2\pi} J_i = \frac{1}{2\pi} \oint p_i \, dq_i    (15.117)

The canonical variable conjugate to the action variable I is the angle variable φ. Note that the name "action variable" is used to differentiate I from the action functional S = \int L \, dt which has the same units, i.e. angular momentum.
The general principle underlying the use of action-angle variables is illustrated by considering one body, 
of mass m, subject to a one-dimensional bound conservative potential energy U(q). The Hamiltonian is 
given by 


    H(p, q) = \frac{p^2}{2m} + U(q)    (15.118)

This bound system has a (q, p) phase space contour for each energy H = E.

    p(q, E) = \pm\sqrt{2m(E - U(q))}    (15.119)


For an oscillatory system the two-valued momentum of equation 15.119 is non-trivial to handle. By contrast, the area J = \oint p \, dq of the closed loop in phase space is a single-valued scalar quantity that depends on E and U(q). Moreover, Liouville's theorem states that the area of the closed contour in phase space J = \oint p \, dq is invariant to canonical transformations. These facts suggest the use of a new pair of conjugate variables, (φ, I), where I(E) uniquely labels the trajectory, and corresponding area, of a closed loop in phase space for each value of E, and the single-valued function φ is a corresponding angle that specifies the exact point along the phase-space contour as illustrated in Fig 15.3.

For simplicity consider the linear harmonic oscillator where

    U(q) = \frac{1}{2} m \omega^2 q^2    (15.120)

Then the Hamiltonian, equation 15.118, equals

    H(p, q) = \frac{p^2}{2m} + \frac{1}{2} m \omega^2 q^2    (15.121)

Hamilton's equations of motion give that

    \dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q    (15.122)

    \dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m}    (15.123)

The solution of equations 15.122 and 15.123 is of the form

    q = C \cos(\omega(t - t_0))    (15.124)

    p = -m\omega C \sin(\omega(t - t_0))    (15.125)

where C and t_0 are integration constants. For the harmonic oscillator, equations 15.124 and 15.125 correspond to the usual elliptical contours in phase space, as illustrated in figure 15.3.

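The action-angle canonical transformation involves making the transformation

    (q, p) \rightarrow (\phi, I)    (15.126)

where I is defined by equation 15.117 and φ is the corresponding canonical angle. The logical approach to this canonical transformation for the harmonic oscillator is to define q and p in terms of φ and I

    q = \sqrt{\frac{2I}{m\omega}} \cos\phi    (15.127)

    p = -\sqrt{2mI\omega} \sin\phi    (15.128)

where the minus sign in equation 15.128 matches the sign of p in equation 15.125. Note that the Poisson bracket is unity

    [q, p]_{(\phi, I)} = \frac{\partial q}{\partial \phi}\frac{\partial p}{\partial I} - \frac{\partial q}{\partial I}\frac{\partial p}{\partial \phi} = \sin^2\phi + \cos^2\phi = 1

which implies that the above transformation is canonical, and thus the phase space area I(E) = \frac{1}{2\pi} \oint p \, dq is conserved.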


For this canonical transformation the transformed Hamiltonian H(φ, I) is

    H(\phi, I) = \frac{2mI\omega}{2m} \sin^2\phi + \frac{1}{2} m\omega^2 \frac{2I}{m\omega} \cos^2\phi = \omega I    (15.129)

Note that this Hamiltonian is a constant that is independent of the angle φ, and thus Hamilton's equations of motion give

    \dot{I} = -\frac{\partial H(\phi, I)}{\partial \phi} = 0    (15.130)

    \dot{\phi} = \frac{\partial H(\phi, I)}{\partial I} = \omega    (15.131)

Thus we have mapped the harmonic oscillator to new coordinates (φ, I) where

    I = \frac{H(\phi, I)}{\omega} = \frac{E}{\omega}    (15.132)

    \phi = \omega(t - t_0)    (15.133)

Figure 15.3: The potential energy V(q) (upper) and corresponding phase space (p, q) (middle) for the harmonic oscillator at four equally spaced total energies E. The corresponding action-angles (I, φ) resulting from a canonical transformation of this system are shown in the lower plot.


That is, the phase space has been mapped from ellipses, with area proportional to E in the (q, p) phase space, to a cylindrical (φ, I) phase space where I = E/ω are constant values that are independent of the angle, while φ increases linearly with time. Thus the variables (q, p) are periodic with modulus Δφ = 2π.

    q(\phi + 2\pi, I) = q(\phi, I)    (15.134)

    p(\phi + 2\pi, I) = p(\phi, I)    (15.135)

The period τ of the periodic oscillatory motion is given simply by Δφ = 2π = ωτ, which is the well known result for the harmonic oscillator. Note that the action-angle variable canonical transformation has determined the frequency of the periodic motion without solving the detailed trajectory of the motion.
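
As a numerical cross-check of the relation I = E/ω for the harmonic oscillator, the following minimal sketch evaluates the phase integral (1/2π)∮ p dq along the trajectory of equations 15.124 and 15.125. The function name, the midpoint rule in time, and the parameter values are illustrative assumptions, not part of the text.

```python
import math


def action_variable(mass, omega, amplitude, steps=20000):
    """I = (1 / 2 pi) * closed integral of p dq over one period.

    The integral is evaluated along q = C cos(w t), p = -m w C sin(w t)
    (t0 = 0) using the midpoint rule in time, with dq = (dq/dt) dt.
    """
    period = 2.0 * math.pi / omega
    dt = period / steps
    area = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        p = -mass * omega * amplitude * math.sin(omega * t)
        q_dot = -omega * amplitude * math.sin(omega * t)
        area += p * q_dot * dt
    return area / (2.0 * math.pi)


mass, omega, amplitude = 2.0, 3.0, 0.5  # illustrative values
energy = 0.5 * mass * omega**2 * amplitude**2
I = action_variable(mass, omega, amplitude)
print(I, energy / omega)
```

The two printed numbers should agree to within the quadrature error, which illustrates that the enclosed phase-space area alone fixes E/ω.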
The above example of the harmonic oscillator has shown that, for integrable periodic systems, it is
possible to identify a canonical transformation to (φ, I) such that the Hamiltonian is independent of the
angle φ which specifies the instantaneous location on the constant energy contour I. If the phase space
contour is a separatrix, then it divides phase space into invariant regions containing phase-space contours
with differing behavior. The action-angle variables are not useful for separatrix contours. For rolling motion,
the system rotates with continuously increasing, or decreasing angle, and there is no natural boundary for the
action-angle variable since the phase space trajectory is continuous and not closed. However, the action-angle
approach still is valid if the motion involves periodic as well as rolling motion.

The example of the one-dimensional, one-body, harmonic oscillator can be expanded to the more general case for many bodies in three dimensions. This is illustrated by considering multiple periodic systems for which the Hamiltonian is conservative and where the equations of the canonical transformation are separable. The generalized momenta then can be written as

    p_i = \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \dots, \alpha_n)}{\partial q_i}    (15.136)

for which each p_i is a function of q_i and the n integration constants α_j

    p_i = p_i(q_i, \alpha_1, \alpha_2, \dots, \alpha_n)    (15.137)

The momentum p_i(q_i, α_1, α_2, ..., α_n) represents the trajectory of the system in the (q_i, p_i) phase space that is characterized by Hamilton's characteristic function W(q, J). Combining equations 15.116, 15.136 gives

    J_i = \oint \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \dots, \alpha_n)}{\partial q_i} \, dq_i    (15.138)

Since q_i is merely a variable of integration, each action variable J_i is a function of the n constants of integration in the Hamilton-Jacobi equation. Because of the independence of the separable-variable pairs (q_i, p_i), the J_i form n independent functions of the α_i, and hence are suitable for use as a new set of constant momenta. Thus the characteristic function W can be written as


    W(q_1, \dots, q_n; J_1, \dots, J_n) = \sum_j W_j(q_j; J_1, \dots, J_n)    (15.139)

while the Hamiltonian is only a function of the momenta, H(J_1, ..., J_n).

The generalized coordinate conjugate to the action variable I_i = J_i/2π (equation 15.117) is known as the angle variable φ_i, which is defined by the transformation equation

    \phi_i = \frac{\partial W}{\partial I_i} = \sum_j \frac{\partial W_j(q_j; I_1, \dots, I_n)}{\partial I_i}    (15.140)

The corresponding equation of motion for φ_i is given by

    \dot{\phi}_i = \frac{\partial H(I)}{\partial I_i} = \omega_i(I_1, \dots, I_n)    (15.141)

where ω_i(I) are constant functions of the action variables I_i with a solution

    \phi_i = \omega_i t + \beta_i    (15.142)

that is, they are linear functions of time. The constants ω_i can be identified with the angular frequencies of the
multiple periodic motions.

The action-angle variables appear to be no different than a particular set of transformed coordinates. Their merit appears when the physical interpretation is assigned to ω_i. Consider the change δφ_i as the q_j are changed infinitesimally

    \delta\phi_i = \sum_j \frac{\partial \phi_i}{\partial q_j} \, dq_j = \sum_j \frac{\partial^2 W}{\partial q_j \, \partial I_i} \, dq_j    (15.143)

The derivative with respect to q_j vanishes except for the W_j component of W. Thus equation 15.143 reduces to

    \delta\phi_i = \frac{\partial}{\partial I_i} \sum_j \frac{\partial W_j(q_j, I)}{\partial q_j} \, dq_j    (15.144)

Therefore, the total change in φ_i as the coordinate q_j goes through one complete cycle is

    \Delta\phi_i = \frac{\partial}{\partial I_i} \oint \frac{\partial W_j(q_j, I)}{\partial q_j} \, dq_j = \frac{\partial J_j}{\partial I_i} = 2\pi \delta_{ij}    (15.145)

where \partial/\partial I_i is outside the integral since the I_i are constants for cyclic motion, and J_j = 2πI_j from equation 15.117. Thus Δφ_i = 2π = ω_i τ_i where τ_i is the period for one cycle of oscillation, and the angular frequency ω_i is given by

    \frac{\omega_i}{2\pi} = \nu_i = \frac{1}{\tau_i}    (15.146)

Thus the frequency ν associated with the periodic motion is the reciprocal of the period τ. The secret here is that the derivative of H with respect to the action variable, given by equation (15.141), directly determines the frequency of the periodic motion without the need to solve the complete equations of motion. Note that multiple periodic motion can be represented by a Fourier expansion of the form

    q_k(t) = \sum_{j_1=-\infty}^{\infty} \sum_{j_2=-\infty}^{\infty} \cdots \sum_{j_n=-\infty}^{\infty} a^{(k)}_{j_1 j_2 \dots j_n} \, e^{i(j_1\omega_1 + j_2\omega_2 + \dots + j_n\omega_n)t}    (15.147)


Although the action-angle approach to Hamilton-Jacobi theory does not produce complete equations of 
motion, it does provide the frequency decomposition that often is the physics of interest. The reason that 
the powerful action-angle variable approach has been introduced here is that it is used extensively in celestial 
mechanics. The action-angle concept also played a key role in the development of quantum mechanics, in 
that Sommerfeld recognized that Bohr's ad hoc assumption that angular momentum is quantized could be
expressed in terms of quantization of the action variable, as is mentioned in chapter 18.
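
The claim that the derivative of the Hamiltonian with respect to the action variable gives the frequency can be checked numerically for a system that is not harmonic. The sketch below is an illustration only: it assumes a quartic potential U(q) = q^4/4 with the action variable I = J/2π of equation 15.117, uses the substitution q = A sin θ to keep the integrands smooth, and the helper names (simpson, action, period) are hypothetical.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def action(energy, mass=1.0):
    """I(E) = (1 / 2 pi) * closed integral of p dq for U(q) = q**4 / 4.

    With q = A sin(theta) and A = (4 E)**0.25 one finds
    p dq = sqrt(2 m E) * A * cos^2(theta) * sqrt(1 + sin^2(theta)) d(theta).
    """
    A = (4.0 * energy) ** 0.25

    def integrand(theta):
        return (math.sqrt(2.0 * mass * energy) * A
                * math.cos(theta) ** 2
                * math.sqrt(1.0 + math.sin(theta) ** 2))

    # Factor 2 accounts for the upper and lower branches of p(q).
    return 2.0 * simpson(integrand, -math.pi / 2, math.pi / 2) / (2.0 * math.pi)


def period(energy, mass=1.0):
    """Period from the direct integral tau = closed integral of m dq / p."""
    A = (4.0 * energy) ** 0.25

    def integrand(theta):
        return 1.0 / math.sqrt(1.0 + math.sin(theta) ** 2)

    prefactor = 2.0 * mass * A / math.sqrt(2.0 * mass * energy)
    return prefactor * simpson(integrand, -math.pi / 2, math.pi / 2)


energy = 1.5   # illustrative value
dE = 1e-4
# omega = dE/dI, estimated with a central difference of I(E)
omega_from_action = 2.0 * dE / (action(energy + dE) - action(energy - dE))
omega_from_period = 2.0 * math.pi / period(energy)
print(omega_from_action, omega_from_period)
```

The two estimates of the angular frequency should agree closely, even though no trajectory q(t) was ever computed for the first one.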


15.5.2 Adiabatic invariance of the action variables 


When the Hamiltonian depends on time it can be quite difficult to solve for the motion because it is difficult 
to find constants of motion for time-dependent systems. However, if the time dependence is sufficiently 
slow, that is, if the motion is adiabatic, then there exist dynamical variables that are almost constant which 
can be used to solve for the motion. In particular, such approximate constants are the familiar action-angle 
integrals. The adiabatic invariance of the action variables played an important role in the development of 
quantum mechanics during the 1911 Solvay Conference. This was a time when physicists were grappling with 
the concepts of quantum mechanics. Einstein used the following classical mechanics example of adiabatic 
invariance, applied to the simple pendulum, in order to illustrate the concept of adiabatic invariance of the 
action. This example demonstrates the power of using action-angle variables. 


15.17 Example: Adiabatic invariance for the simple pendulum 


Consider that the pendulum is made up of a point mass M suspended from a pivot by a light string of length L that is swinging freely in a vertical plane. Derive the dependence of the amplitude of the oscillations θ_0, assuming θ is small, if the string is very slowly shortened by a factor of 2, that is, assume that the change in length during one period of the oscillation is very small.

The tension in the string T is given by

    T = Mg\cos\theta + ML\dot{\theta}^2

Let the pendulum angle be oscillatory

    \theta = \theta_0 \cos(\omega t + \phi_0)

Then the average mean square amplitude and velocity over one period are

    \langle\theta^2\rangle = \langle[\theta_0 \cos(\omega t + \phi_0)]^2\rangle = \frac{\theta_0^2}{2}

    \langle\dot{\theta}^2\rangle = \langle[-\theta_0 \omega \sin(\omega t + \phi_0)]^2\rangle = \frac{\omega^2 \theta_0^2}{2}

Since, for the simple pendulum, ω^2 = g/L, then the average tension in the string for small θ is

    \langle T \rangle = Mg\left(1 - \frac{\langle\theta^2\rangle}{2}\right) + ML\langle\dot{\theta}^2\rangle = Mg\left(1 + \frac{\theta_0^2}{4}\right)

Assuming that θ_0 is a small angle, and that the change in length -ΔL is very small during one period τ, then the work done is

    \Delta W = -\langle T \rangle \Delta L = -Mg\Delta L - \frac{\theta_0^2}{4} Mg\Delta L    (a)

while the change in internal oscillator energy is

    \Delta(-MgL\cos\theta_0) = \Delta\left[-MgL\left(1 - \frac{\theta_0^2}{2}\right)\right] = -Mg\Delta L + \frac{1}{2} Mg \Delta(L\theta_0^2) = -Mg\Delta L + \frac{1}{2} Mg\theta_0^2 \Delta L + MgL\theta_0 \Delta\theta_0    (b)

The work done must balance the increment in internal energy, therefore

    L\theta_0 \Delta\theta_0 + \frac{3\theta_0^2 \Delta L}{4} = 0

or

    L\theta_0^2 \, \Delta \ln(\theta_0 L^{3/4}) = 0

Therefore it follows that

    \theta_0 L^{3/4} = \text{constant}    (c)

or

    \theta_0 \propto L^{-3/4}

Thus shortening the length of the pendulum string from L to L/2 adiabatically corresponds to the amplitude increasing by a factor 1.68.

Consider the action-angle integral for one closed period τ = 2π/ω for this problem

    J = \oint p_\theta \, d\theta = \oint ML^2\dot{\theta} \cdot \dot{\theta} \, dt = ML^2 \langle\dot{\theta}^2\rangle \frac{2\pi}{\omega} = \pi M L^2 \theta_0^2 \omega = \pi M g^{1/2} \theta_0^2 L^{3/2} = \text{constant}

where the last step is due to equation (c).
The above example shows that the action integral J = constant, that is, it is invariant to an adiabatic change. In retrospect this result is as expected in that the action integral should be minimized.
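
The adiabatic scaling of the amplitude with L^{-3/4} can also be checked by direct numerical integration. The sketch below is illustrative rather than definitive: it assumes the small-angle equation of motion that follows from d(ML^2 dθ/dt)/dt = -MgLθ, a linear shortening schedule from L0 to L0/2, a fourth-order Runge-Kutta integrator, and hypothetical parameter values.

```python
import math


def shorten_pendulum(theta0=0.05, L0=1.0, g=9.81, total_time=400.0, dt=0.005):
    """Integrate a small-angle pendulum whose length shrinks from L0 to L0/2.

    Equation of motion (assumed): theta'' = -(2 L'/L) theta' - (g/L) theta.
    Returns (amplitude at start, amplitude at end, final length).
    """
    L_dot = -0.5 * L0 / total_time   # slow, constant shortening rate

    def length(t):
        return L0 + L_dot * t

    def deriv(t, theta, theta_dot):
        L = length(t)
        return theta_dot, -(2.0 * L_dot / L) * theta_dot - (g / L) * theta

    def amplitude(t, theta, theta_dot):
        # Instantaneous amplitude of a harmonic oscillation of frequency omega
        omega = math.sqrt(g / length(t))
        return math.sqrt(theta ** 2 + (theta_dot / omega) ** 2)

    theta, theta_dot = theta0, 0.0
    amp_start = amplitude(0.0, theta, theta_dot)
    steps = int(round(total_time / dt))
    for n in range(steps):
        t = n * dt
        k1 = deriv(t, theta, theta_dot)
        k2 = deriv(t + dt / 2, theta + dt / 2 * k1[0], theta_dot + dt / 2 * k1[1])
        k3 = deriv(t + dt / 2, theta + dt / 2 * k2[0], theta_dot + dt / 2 * k2[1])
        k4 = deriv(t + dt, theta + dt * k3[0], theta_dot + dt * k3[1])
        theta += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        theta_dot += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    t_end = steps * dt
    return amp_start, amplitude(t_end, theta, theta_dot), length(t_end)


amp_start, amp_end, L_end = shorten_pendulum()
ratio = amp_end / amp_start
print(ratio, 2 ** 0.75)
```

For a sufficiently slow shortening the amplitude ratio should approach 2^{3/4}, which is the factor 1.68 quoted above, and the product of the squared amplitude with L^{3/2} should be nearly unchanged.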
15.6 Canonical perturbation theory


Most examples in classical mechanics discussed so far have been capable of exact solutions. In real life, the 
majority of problems cannot be solved exactly. For example, in celestial mechanics the two-body Kepler 
problem can be solved exactly, but solution of the three-body problem is intractable. Typical systems in 
celestial mechanics are never as simple as the two-body Kepler system because of the influence of additional 
bodies. Fortunately in most cases the influence of additional bodies is sufficiently small to allow use of 
perturbation theory. That is, the restricted three-body approximation can be employed for which the system 
is reduced to considering it as an exactly solvable two-body problem, subject to a small perturbation to this 
solvable two-body system. Note that even though the change in the Hamiltonian due to the perturbing term 
may be small, the impact on the motion can be especially large near a resonance. 
Consider that the Hamiltonian, subject to a time-dependent perturbation, is written as

    H(q, p, t) = H_0(q, p, t) + \Delta H(q, p, t)

where H_0(q, p, t) designates the unperturbed Hamiltonian and \Delta H(q, p, t) designates the perturbing term. For the unperturbed system the Hamilton-Jacobi equation is given by

    \mathcal{H}(Q_i, P_i, t) = H_0\left(q_1, \dots, q_n; \frac{\partial S}{\partial q_1}, \dots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0    (15.90)

where S(q_i, P_i, t) is the generating function for the canonical transformation (q, p) \rightarrow (Q, P). The perturbed S(q_i, P_i, t) remains a canonical transformation, but the transformed Hamiltonian \mathcal{H}(Q_i, P_i, t) \neq 0. That is,

    \mathcal{H}(Q_i, P_i, t) = H_0 + \Delta H + \frac{\partial S}{\partial t} = \Delta H(q, p, t)    (15.148)

The equations of motion satisfied by the transformed variables now are

    \dot{Q}_i = \frac{\partial \Delta H}{\partial P_i}    (15.149)

    \dot{P}_i = -\frac{\partial \Delta H}{\partial Q_i}

These equations remain as difficult to solve as the full Hamiltonian. However, the perturbation technique assumes that \Delta H is small, and that one can neglect the change of (Q_i, P_i) over the perturbing interval. Therefore, to a first approximation, the unperturbed values of \partial \Delta H / \partial P_i and \partial \Delta H / \partial Q_i can be used in equations 15.149.

A detailed explanation of canonical perturbation theory is presented in chapter 12 of Goldstein [Go50].


15.18 Example: Harmonic oscillator perturbation 


(a) Consider first the Hamilton-Jacobi equation for the generating function S(q, α, t) for the case of a single free particle subject to the Hamiltonian H = \frac{1}{2}p^2. Find the canonical transformation q = q(β, α) and p = p(β, α) where β and α are the transformed coordinate and momentum respectively.

The Hamilton-Jacobi equation is

    \frac{\partial S}{\partial t} + H(q, p, t) = 0

Using p = \partial S / \partial q in the Hamiltonian H = \frac{1}{2}p^2 gives

    \frac{\partial S}{\partial t} + \frac{1}{2}\left(\frac{\partial S}{\partial q}\right)^2 = 0

Since H does not depend on q, t explicitly, then the two terms on the left hand side of the equation can be set equal to -γ, γ respectively, where γ is at most a function of p. Then the generating function is

    S = \sqrt{2\gamma} \, q - \gamma t

Set α = \sqrt{2\gamma}; then the generating function can be written as

    S = \alpha q - \frac{1}{2}\alpha^2 t

The constant α can be identified with the new momentum P. Then the transformation equations become

    p = \frac{\partial S}{\partial q} = \alpha \qquad Q = \frac{\partial S}{\partial \alpha} = q - \alpha t = \beta

That is

    q = \beta + \alpha t

which corresponds to motion with a uniform velocity α in the q, p system.

(b) Consider that the Hamiltonian is perturbed by addition of potential U = \frac{q^2}{2} which corresponds to the harmonic oscillator. Then

    H = \frac{1}{2}p^2 + \frac{1}{2}q^2

Consider the transformed Hamiltonian

    \mathcal{H} = H + \frac{\partial S}{\partial t} = \frac{1}{2}p^2 + \frac{1}{2}q^2 - \frac{1}{2}\alpha^2 = \frac{1}{2}q^2 = \frac{1}{2}(\beta + \alpha t)^2

Hamilton's equations of motion

    \dot{Q} = \frac{\partial \mathcal{H}}{\partial P} \qquad \dot{P} = -\frac{\partial \mathcal{H}}{\partial Q}

give that

    \dot{\beta} = (\beta + \alpha t) t

    \dot{\alpha} = -(\beta + \alpha t)

These two equations can be solved to give

    \ddot{\alpha} + \alpha = 0

which is the equation of a harmonic oscillator showing that α is harmonic of the form α = α_0 \sin(t + δ) where α_0, δ are constants of motion. Thus

    \beta = -\dot{\alpha} - \alpha t = -\alpha_0 [\cos(t + \delta) + t \sin(t + \delta)]

The transformation equations then give

    p = \alpha = \alpha_0 \sin(t + \delta)

    q = \beta + \alpha t = -\dot{\alpha} = -\alpha_0 \cos(t + \delta)

Hence the solution for the perturbed system is harmonic, which is to be expected since the potential has a quadratic dependence on position.
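
The result of part (b) can be verified numerically. The following minimal sketch integrates the two first-order equations for α and β with a fourth-order Runge-Kutta step and compares q = β + αt with the analytic form -α_0 cos(t + δ). The function name, step size, and the values of α_0 and δ are illustrative assumptions.

```python
import math


def max_deviation(alpha0=1.0, delta=0.3, t_end=10.0, dt=1e-3):
    """Integrate alpha' = -(beta + alpha t), beta' = (beta + alpha t) t.

    Starts from the t = 0 values of the analytic solution and returns the
    largest deviation of q = beta + alpha t from -alpha0 cos(t + delta).
    """
    def rhs(t, alpha, beta):
        q = beta + alpha * t
        return -q, q * t          # (d alpha / dt, d beta / dt)

    alpha = alpha0 * math.sin(delta)
    beta = -alpha0 * math.cos(delta)
    steps = int(round(t_end / dt))
    worst = 0.0
    for n in range(steps):
        t = n * dt
        k1 = rhs(t, alpha, beta)
        k2 = rhs(t + dt / 2, alpha + dt / 2 * k1[0], beta + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, alpha + dt / 2 * k2[0], beta + dt / 2 * k2[1])
        k4 = rhs(t + dt, alpha + dt * k3[0], beta + dt * k3[1])
        alpha += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        beta += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t_new = (n + 1) * dt
        q = beta + alpha * t_new
        worst = max(worst, abs(q + alpha0 * math.cos(t_new + delta)))
    return worst


deviation = max_deviation()
print(deviation)
```

A small deviation confirms that the transformed variables, although they contain secular terms such as t sin(t + δ), combine to give purely harmonic motion in q.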


15.19 Example: Lindblad resonance in planetary and galactic motion 


Use of canonical perturbation theory in celestial mechanics has been exploited by Professor Alice Quillen
and her group. They combine use of action-angle variables and Hamilton-Jacobi theory to investigate the role
of Lindblad resonance in planetary motion, and also in stellar motion in galaxies. A Lindblad resonance
is an orbital resonance in which the orbital period of a celestial body is a simple multiple of some forcing
frequency. Even for very weak perturbing forces, such resonance behavior can lead to orbit capture and chaotic
motion.

For planetary motion the planet masses are about 1/1000 that of the central star, so the perturbations
to Kepler orbits are small. However, Lindblad resonance for planetary motion led to Saturn's rings, which
result from perturbations produced by the moons of Saturn that sculpt and clear dust rings. Stellar orbits in
disk galaxies are perturbed a few percent by non axially-symmetric galactic features such as spiral arms or
bars. Lindblad resonances perturb stellar motion and drive spiral density waves at distances from the center
of a galactic disk where the natural frequency of the radial component of a star's orbital velocity is close to
the frequency of the fluctuations in the gravitational field due to passage through spiral arms or bars. If a
star's orbital speed around a galactic center is greater than that of the part of a spiral arm through which it is
traversing, then an inner Lindblad resonance occurs which speeds up the star's orbital speed, moving the orbit
outwards. If the orbital speed is less than that of a spiral arm, an outer Lindblad resonance occurs, causing
inward movement of the orbit.
15.7 Symplectic representation


Hamilton's first-order equations of motion are symmetric if the generalized and constraint force terms, in equation 15.9, are excluded.

    \dot{q} = \frac{\partial H}{\partial p} \qquad \dot{p} = -\frac{\partial H}{\partial q}

This stimulated attempts to treat the canonical variables (q, p) in a symmetric form using group theory. Some graduate textbooks in classical mechanics have adopted use of symplectic symmetry in order to unify the presentation of Hamiltonian mechanics. For a system of n degrees of freedom, a column matrix η is constructed that has 2n elements where

    \eta_j = q_j \qquad \eta_{n+j} = p_j \qquad j \le n    (15.150)

Therefore the column matrix \partial H / \partial \eta has elements

    \left(\frac{\partial H}{\partial \eta}\right)_j = \frac{\partial H}{\partial q_j} \qquad \left(\frac{\partial H}{\partial \eta}\right)_{n+j} = \frac{\partial H}{\partial p_j} \qquad j \le n    (15.151)

The symplectic matrix J is defined as being a 2n by 2n skew-symmetric, orthogonal matrix that is broken into four n × n null or unit matrices according to the scheme

    J = \begin{pmatrix} [0] & +[1] \\ -[1] & [0] \end{pmatrix}    (15.152)

where [0] is the n-dimensional null matrix, for which all elements are zero. Also [1] is the n-dimensional unit matrix, for which the diagonal matrix elements are unity and all off-diagonal matrix elements are zero. The J matrix accounts for the opposite signs used in the equations for \dot{q} and \dot{p}. The symplectic representation allows Hamilton's equations of motion to be written in the compact form

    \dot{\eta} = J \frac{\partial H}{\partial \eta}    (15.153)

This textbook does not use the elegant symplectic representation since it ignores the important generalized 
forces and Lagrange multiplier forces. 
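
As an illustration of the compact symplectic form, the following minimal sketch builds the 2n by 2n matrix J from its block structure, checks that it is skew-symmetric and orthogonal, and evaluates the product of J with the gradient of H for a one-dimensional harmonic oscillator. The helper names and the numerical values are hypothetical choices made only for this example.

```python
def symplectic_matrix(n):
    """2n x 2n matrix with blocks [[0, +1], [-1, 0]], each block n x n."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def hamilton_rhs(grad_H, eta):
    """Time derivative of eta = (q_1..q_n, p_1..p_n) as J times dH/d(eta)."""
    n = len(eta) // 2
    J = symplectic_matrix(n)
    g = grad_H(eta)
    return [sum(J[i][k] * g[k] for k in range(2 * n)) for i in range(2 * n)]


# Harmonic oscillator H = p^2 / 2m + m w^2 q^2 / 2 with eta = (q, p)
mass, omega = 2.0, 3.0


def grad_H(eta):
    q, p = eta
    return [mass * omega ** 2 * q, p / mass]   # (dH/dq, dH/dp)


eta_dot = hamilton_rhs(grad_H, [0.5, 1.0])
print(eta_dot)   # components are (dq/dt, dp/dt) = (p/m, -m w^2 q)
```

The single matrix product reproduces both of Hamilton's equations, including the relative minus sign, which is the bookkeeping role that J plays in this representation.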


15.8 Comparison of the Lagrangian and Hamiltonian formulations 


Common features 


The discussion of Lagrangian and Hamiltonian dynamics has illustrated the power of such algebraic formu- 
lations. Both approaches are based on application of variational principles to scalar energy which gives the 
freedom to concentrate solely on active forces and to ignore internal forces. Both methods can handle many- 
body systems and exploit canonical transformations, which are impractical or impossible using the vectorial 
Newtonian mechanics. These algebraic approaches simplify the calculation of the motion for constrained 
systems by representing the vector force fields, as well as the corresponding equations of motion, in terms of 
either the Lagrangian function L(q, \dot{q}, t) or the action functional S(q, p, t) which are related by the definite integral

    S(q, p) = \int_{t_1}^{t_2} L(q, \dot{q}, t) \, dt    (15.1)

The Lagrangian function L(q, \dot{q}, t), and the action functional S(q, p, t), are scalar functions under rotation, but they determine the vector force fields and the corresponding equations of motion. Thus the use of rotationally-invariant functions L(q, \dot{q}, t) and S(q, p, t) provides a simple representation of the vector force fields. This is analogous to the use of scalar potential fields φ(q, t) to represent the electrostatic and gravitational vector force fields. Like scalar potential fields, Lagrangian and Hamiltonian mechanics represent the observables as derivatives of L(q, \dot{q}, t) and S(q, p, t), and the absolute values of L(q, \dot{q}, t) and S(q, p, t) are undefined; only differences in L(q, \dot{q}, t) and S(q, p, t) are observable. For example, the generalized momenta are given by the derivatives p_i = \partial L / \partial \dot{q}_i and p_i = \partial S / \partial q_i. The physical significance of the least action S(q, α, t) is illustrated when the canonically transformed momentum P = α is a constant. Then the generalized momenta and the Hamilton-Jacobi equation imply that the total time derivative of the action equals

    \frac{dS}{dt} = \frac{\partial S}{\partial q_i} \dot{q}_i + \frac{\partial S}{\partial t} = p_i \dot{q}_i - H = L    (15.154)

The indefinite integral of this equation reproduces the definite integral (15.1) to within an arbitrary constant, i.e.

    S(q, p) = \int L(q, \dot{q}, t) \, dt + \text{constant}    (15.155)


Lagrangian formulation: 


Consider a system with n independent generalized coordinates, plus m constraint forces that are not required 
to be known. The Lagrangian approach can reduce the system to a minimal system of s = n — m inde- 
pendent generalized coordinates leading to s = n — m second-order differential equations. By comparison, 
the Newtonian approach uses n + m unknowns. Alternatively, the Lagrange multipliers approach allows 
determination of the holonomic constraint forces resulting in s = n +m second order equations to determine 
s = n + m unknowns. The Lagrangian potential function is limited to conservative forces, but generalized 
forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange 
equations of motion is that they can deal with any type of force, conservative or non-conservative, and 
they directly determine q, \dot{q} rather than q, p, which then requires relating p to \dot{q}. The Lagrange approach is
superior to the Hamiltonian approach if a numerical solution is required for typical undergraduate problems 
in classical mechanics. However, Hamiltonian mechanics has a clear advantage for addressing more profound 
and philosophical questions in physics. 


Hamiltonian formulation: 


For a system with n independent generalized coordinates, and m constraint forces, the Hamiltonian approach 
determines 2n first-order differential equations. In contrast to Lagrangian mechanics, where the Lagrangian 
is a function of the coordinates and their velocities, the Hamiltonian uses the variables q and p, rather 
than velocity. The Hamiltonian has twice as many independent variables as the Lagrangian which is a great 
advantage, not a disadvantage, since it broadens the realm of possible transformations that can be used to 
simplify the solutions. Hamiltonian mechanics uses the conjugate coordinates q, p, corresponding to phase 
space. This is an advantage in most branches of physics and engineering. Compared to Lagrangian mechanics, 
Hamiltonian mechanics has a significantly broader arsenal of powerful techniques that can be exploited to 
obtain an analytical solution of the integrals of the motion for complicated systems. These techniques 
include the Poisson bracket formulation, canonical transformations, the Hamilton-Jacobi approach, the
action-angle variables, and canonical perturbation theory. In addition, Hamiltonian dynamics also provides 
a means of determining the unknown variables for which the solution assumes a soluble form, and it is 
ideal for study of the fundamental underlying physics in applications to other fields such as quantum or 
statistical physics. However, the Hamiltonian approach endemically assumes that the system is conservative 
putting it at a disadvantage with respect to the Lagrangian approach. The appealing symmetry of the 
Hamiltonian equations, plus their ability to utilize canonical transformations, makes it the formalism of 
choice for examination of system dynamics. For example, Hamilton-Jacobi theory, action-angle variables 
and canonical perturbation theory are used extensively to solve complicated multibody orbit perturbations 
in celestial mechanics by finding a canonical transformation that transforms the perturbed Hamiltonian to 
a solved unperturbed Hamiltonian. 

The Hamiltonian formalism features prominently in quantum mechanics since there are well established 
rules for transforming the classical coordinates and momenta into linear operators used in quantum mechanics.
The variables q, \dot{q} used in Lagrangian mechanics do not have simple analogs in quantum physics.
As a consequence, the Poisson bracket formulation, and action-angle variables of Hamiltonian mechanics
played a key role in development of matrix mechanics by Heisenberg, Born, and Dirac, while the Hamilton-Jacobi
formulation played a key role in development of Schrödinger's wave mechanics. Similarly, Hamiltonian
mechanics is the preeminent variational approach used in statistical mechanics.
15.9 Summary


This chapter has gone beyond what is normally covered in an undergraduate course in classical mechanics, 
in order to illustrate the power of the remarkable arsenal of methods available for solution of the equations of 
motion using Hamiltonian mechanics. This has included the Poisson bracket representation of Hamiltonian 
formulation of mechanics, canonical transformations, Hamilton-Jacobi theory, action-angle variables, and 
canonical perturbation theory. The purpose was to illustrate the power of variational principles in Hamil- 
tonian mechanics and how they relate to fields such as quantum mechanics. The following are the key points 
made in this chapter. 


Poisson brackets: The elegant and powerful Poisson bracket formalism of Hamiltonian mechanics was introduced. The Poisson bracket of any two continuous functions of generalized coordinates F(p, q) and G(p, q), is defined to be

    [F, G]_{qp} = \sum_i \left( \frac{\partial F}{\partial q_i} \frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q_i} \right)    (15.13)

The fundamental Poisson brackets equal

    [q_k, q_l] = 0    (15.21)

    [p_k, p_l] = 0    (15.22)

    [q_k, p_l] = -[p_l, q_k] = \delta_{kl}    (15.23)

The Poisson bracket is invariant to a canonical transformation from (q, p) to (Q, P). That is

    [F, G]_{qp} = \sum_k \left( \frac{\partial F}{\partial Q_k} \frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k} \frac{\partial G}{\partial Q_k} \right) = [F, G]_{QP}    (15.32)

There is a one-to-one correspondence between the commutator and Poisson bracket of two independent functions,

    (F_1 G_1 - G_1 F_1) = \lambda [F_1, G_1]    (15.38)

where λ is an independent constant. In particular F_1, G_1 commute if the Poisson bracket [F_1, G_1] = 0.


Poisson Bracket representation of Hamiltonian mechanics: It has been shown that the Poisson bracket formalism contains the Hamiltonian equations of motion and is invariant to canonical transformations. Also this formalism extends Hamilton's canonical equations to non-commuting canonical variables. Hamilton's equations of motion can be expressed directly in terms of the Poisson brackets

    \dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k}    (15.57)

    \dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k}    (15.58)

An important result is that the total time derivative of any operator is given by

    \frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H]    (15.45)

Poisson brackets provide a powerful means of determining which observables are time independent and 
whether different observables can be measured simultaneously with unlimited precision. It was shown that 
the Poisson bracket is invariant to canonical transformations, which is a valuable feature for Hamiltonian 
mechanics. Poisson brackets were used to prove Liouville’s theorem which plays an important role in the use 
of Hamiltonian phase space in statistical mechanics. The Poisson bracket is equally applicable to continuous 
solutions in classical mechanics as well as discrete solutions in quantized systems. 
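
The bracket relations summarized above can be illustrated numerically. The sketch below is a minimal example for one degree of freedom: it approximates the partial derivatives by central finite differences and evaluates the brackets [q, p], [q, H] and [p, H] for a harmonic oscillator. The function names, the step size h, and the sample phase-space point are illustrative assumptions.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """[F, G] = dF/dq dG/dp - dF/dp dG/dq for one degree of freedom.

    The partial derivatives are estimated with central finite differences.
    """
    dF_dq = (F(q + h, p) - F(q - h, p)) / (2 * h)
    dF_dp = (F(q, p + h) - F(q, p - h)) / (2 * h)
    dG_dq = (G(q + h, p) - G(q - h, p)) / (2 * h)
    dG_dp = (G(q, p + h) - G(q, p - h)) / (2 * h)
    return dF_dq * dG_dp - dF_dp * dG_dq


mass, omega = 2.0, 3.0   # illustrative oscillator parameters


def H(q, p):
    return p ** 2 / (2 * mass) + 0.5 * mass * omega ** 2 * q ** 2


def Q(q, p):
    return q


def P(q, p):
    return p


q0, p0 = 0.4, -1.2       # arbitrary phase-space point
bracket_qp = poisson_bracket(Q, P, q0, p0)   # fundamental bracket
q_dot = poisson_bracket(Q, H, q0, p0)        # should equal  dH/dp = p/m
p_dot = poisson_bracket(P, H, q0, p0)        # should equal -dH/dq = -m w^2 q
print(bracket_qp, q_dot, p_dot)
```

Because H has no explicit time dependence, the same function also gives dH/dt = [H, H], which vanishes identically, consistent with energy conservation.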

Canonical transformations: A transformation between a canonical set of variables (q, p) with Hamiltonian
$H(q,p,t)$ and another set of canonical variables (Q, P) with Hamiltonian $\mathcal{H}(Q,P,t)$ can be achieved
using a generating function F such that

$$\mathcal{H}(Q,P,t) = H(q,p,t) + \frac{\partial F}{\partial t} \tag{15.89}$$


Possible generating functions are summarized in the following table. 


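Generating function                        Generating function derivatives                                                         Trivial special case
$F = F_1(q,Q,t)$                           $p_i = \frac{\partial F_1}{\partial q_i}$,  $P_i = -\frac{\partial F_1}{\partial Q_i}$      $F_1 = q_iQ_i$,  $Q_i = p_i$,  $P_i = -q_i$
$F = F_2(q,P,t) - Q\cdot P$                $p_i = \frac{\partial F_2}{\partial q_i}$,  $Q_i = \frac{\partial F_2}{\partial P_i}$       $F_2 = q_iP_i$,  $Q_i = q_i$,  $P_i = p_i$
$F = F_3(p,Q,t) + q\cdot p$                $q_i = -\frac{\partial F_3}{\partial p_i}$,  $P_i = -\frac{\partial F_3}{\partial Q_i}$     $F_3 = p_iQ_i$,  $Q_i = -q_i$,  $P_i = -p_i$
$F = F_4(p,P,t) + q\cdot p - Q\cdot P$     $q_i = -\frac{\partial F_4}{\partial p_i}$,  $Q_i = \frac{\partial F_4}{\partial P_i}$      $F_4 = p_iP_i$,  $Q_i = p_i$,  $P_i = -q_i$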


If the canonical transformation makes H(Q, P,t) = 0 then the conjugate variables (Q, P) are constants 
of motion. Similarly if H(Q, P, t) is a cyclic function then the corresponding P are constants of motion. 


Hamilton-Jacobi theory: Hamilton-Jacobi theory determines the generating function required to perform
canonical transformations, which leads to a powerful method for obtaining the equations of motion for
a system. The Hamilton-Jacobi theory uses the action function $S \equiv F_2$ as a generating function, and the
canonical momentum is given by

$$p_i = \frac{\partial S}{\partial q_i} \tag{15.4}$$

This can be used to replace $p_i$ in the Hamiltonian H, leading to the Hamilton-Jacobi equation

$$H\left(q; \frac{\partial S}{\partial q}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{15.94}$$


Solutions of the Hamilton-Jacobi equation were obtained by separation of variables. The close optical- 
mechanical analogy of the Hamilton-Jacobi theory is an important advantage of this formalism that led to 
it playing a pivotal role in the development of wave mechanics by Schrödinger.


Action-angle variables: The action-angle variables exploit a canonical transformation from $(q,p) \rightarrow (\phi, I)$
where

$$I_i \equiv \frac{1}{2\pi}J_i = \frac{1}{2\pi}\oint p_i\,dq_i \tag{15.117}$$

For periodic motion the phase-space trajectory is closed with area given by J and this area is conserved for
the above canonical transformation. For a conserved Hamiltonian the action variable J is independent of
the angle variable $\phi$. The time dependence of the angle variable $\phi$ directly determines the frequency of the
periodic motion without recourse to calculation of the detailed trajectory of the periodic motion.
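
As an illustration of equation 15.117, the sketch below evaluates the action variable numerically for a one-dimensional harmonic oscillator, for which the closed-orbit integral is known analytically to equal $E/\omega$. The function name `action_variable`, the substitution $q = a\sin\theta$ used to tame the turning points, and the numerical values of E, m and $\omega$ are assumptions made for this example only. The frequency is then estimated from the standard action-angle relation $\dot{\phi} = \partial H/\partial I$, using a finite difference in the energy.

```python
import math


def action_variable(energy, mass, omega, steps=2000):
    """I = (1/2 pi) * closed-orbit integral of p dq for H = p^2/2m + m omega^2 q^2/2."""
    amplitude = math.sqrt(2.0 * energy / (mass * omega ** 2))  # turning point
    dtheta = math.pi / steps
    half_loop = 0.0
    for n in range(steps):
        # Midpoint rule in theta, where q = amplitude * sin(theta).
        theta = -0.5 * math.pi + (n + 0.5) * dtheta
        q = amplitude * math.sin(theta)
        p_squared = 2.0 * mass * (energy - 0.5 * mass * omega ** 2 * q ** 2)
        p = math.sqrt(max(p_squared, 0.0))  # guard against rounding below zero
        half_loop += p * amplitude * math.cos(theta) * dtheta
    # The closed orbit is traversed with +p and then -p, giving twice the half loop.
    return 2.0 * half_loop / (2.0 * math.pi)


E, m, w = 3.0, 2.0, 1.5  # assumed illustrative values
action = action_variable(E, m, w)

# Frequency from dH/dI, estimated with a small energy step.
dE = 1.0e-3
frequency = dE / (action_variable(E + dE, m, w) - action)

print(action, frequency)
```

For these values the action should be close to $E/\omega = 2$ and the estimated frequency close to $\omega = 1.5$, without any integration of the trajectory in time.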


Canonical perturbation theory: Canonical perturbation theory is a valuable method of handling multi- 
body interactions. The adiabatic invariance of the action-angle variables provides a powerful approach for 
exploiting canonical perturbation theory. 


Comparison of Lagrangian and Hamiltonian formulations: The remarkable power, and intellectual 
beauty, provided by use of variational principles to exploit the underlying principles of natural economy in 
nature, has had a long and rich history. It has led to profound developments in many branches of theoretical 
physics. However, it is noted that although the above algebraic formulations of classical mechanics have been 
used for over two centuries, the important limitations of these algebraic formulations to non-linear systems 
remain a challenge that still is being addressed. 

It has been shown that the Lagrangian and Hamiltonian formulations represent the vector force fields,
and the corresponding equations of motion, in terms of the Lagrangian function $L(q,\dot{q},t)$, or the action
functional $S(q,p,t)$, which are scalars under rotation. The Lagrangian function $L(q,\dot{q},t)$ is related to the
action functional $S(q,p,t)$ by

$$S(q,p,t) = \int_{t_1}^{t_2} L(q,\dot{q},t)\,dt \tag{15.1}$$


These functions are analogous to electric potential, in that the observables are derived by taking derivatives
of the Lagrangian function $L(q,\dot{q},t)$ or the action functional $S(q,p,t)$. The Lagrangian formulation is more
convenient for deriving the equations of motion for simple mechanical systems. The Hamiltonian formulation
has a greater arsenal of techniques for solving complicated problems, and it uses the canonical variables $(q_i, p_i)$
which are the variables of choice for applications to quantum mechanics and statistical mechanics.

Workshop exercises


1. Poisson brackets are a powerful means of elucidating when observables are constants of motion and whether
two observables can be simultaneously measured with unlimited precision. Consider a spherically symmetric
Hamiltonian

$$H = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta} \right) + U(r)$$

for a mass m where U(r) is a central potential. Use the Poisson bracket plus the time dependence to determine
the following:

(a) Does $p_\phi$ commute with H and is it a constant of motion?
(b) Does $p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ commute with H and is it a constant of motion?
(c) Does $p_r$ commute with H and is it a constant of motion?
(d) Does $p_\theta$ commute with $p_\phi$ and what does the result imply?


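2. Consider the Poisson brackets for angular momentum L

(a) Show $\{L_i, r_j\} = \epsilon_{ijk}r_k$, where the Levi-Civita tensor is

$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } ijk \text{ are cyclically permuted} \\ -1 & \text{if } ijk \text{ are anti-cyclically permuted} \\ 0 & \text{if } i=j \text{ or } i=k \text{ or } j=k \end{cases}$$

(b) Show $\{L_i, p_j\} = \epsilon_{ijk}p_k$.

(c) Show $\{L_i, L_j\} = \epsilon_{ijk}L_k$. The following identity may be useful: $\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$.

(d) Show $\{L_i, L^2\} = 0$.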


3. Consider the Hamiltonian of a two-dimensional harmonic oscillator,

$$H = \frac{p^2}{2m} + \frac{1}{2}m\left(\omega_1^2 r_1^2 + \omega_2^2 r_2^2\right)$$

What condition is satisfied if $L^2$ is a conserved quantity?

Problems


1. Consider the motion of a particle of mass m in an isotropic harmonic oscillator potential $U = \frac{1}{2}kr^2$ and take
the orbital plane to be the x-y plane. The Hamiltonian is then

$$H \equiv S_0 = \frac{1}{2m}\left(p_x^2 + p_y^2\right) + \frac{1}{2}k\left(x^2 + y^2\right)$$

Introduce the three quantities

$$S_1 = \frac{1}{2m}\left(p_x^2 - p_y^2\right) + \frac{1}{2}k\left(x^2 - y^2\right)$$
$$S_2 = \frac{1}{m}p_xp_y + kxy$$
$$S_3 = \omega\left(xp_y - yp_x\right)$$

with $\omega = \sqrt{\frac{k}{m}}$. Use Poisson brackets to solve the following:

a) Show that $[S_0, S_i] = 0$ for i = 1, 2, 3, proving that $(S_1, S_2, S_3)$ are constants of motion.
b) Show that

$$[S_1, S_2] = 2\omega S_3$$
$$[S_2, S_3] = 2\omega S_1$$
$$[S_3, S_1] = 2\omega S_2$$

so that $(2\omega)^{-1}(S_1, S_2, S_3)$ have the same Poisson bracket relations as the components of a 3-dimensional angular
momentum.
c) Show that

$$S_0^2 = S_1^2 + S_2^2 + S_3^2$$

2. Assume that the transformation equations between the two sets of coordinates (q, p) and (Q, P) are

$$Q = \ln\left(1 + q^{1/2}\cos p\right)$$
$$P = 2\left(1 + q^{1/2}\cos p\right)q^{1/2}\sin p$$

a) Assuming that q, p are canonical variables, i.e. [q, p] = 1, show directly from the above transformation
equations that Q, P are canonical variables.
b) Show that the generating function that generates this transformation between the two sets of canonical variables
is

$$F_3 = -\left[e^Q - 1\right]^2\tan p$$


3. Consider a bound two-body system comprising a mass m in an orbit at a distance r from a mass M. The
attractive central force binding the two-body system is

$$\mathbf{F} = \frac{k}{r^2}\hat{\mathbf{r}}$$

where k is negative. Use Poisson brackets to prove that the eccentricity vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} + \mu k\hat{\mathbf{r}}$ is a conserved
quantity.


4. (a) Consider the case of a single mass m where the Hamiltonian $H = \frac{1}{2}p^2$. Use the generating function
S(q, P, t) to solve the Hamilton-Jacobi equation with the canonical transformation q = q(Q, P) and p =
p(Q, P) and determine the equations relating the (q, p) variables to the transformed coordinate and momentum
(Q, P).

(b) If there is a perturbing Hamiltonian $\Delta H = \frac{1}{2}q^2$, then P will not be constant. Express the transformed
Hamiltonian $\mathcal{H}$ (using the transformation given above) in terms of P, Q, and t. Solve for Q(t) and P(t) and
show that the perturbed solution q[Q(t), P(t)], p[Q(t), P(t)] is simple harmonic.


Chapter 16 


Analytical formulations for continuous 
systems 


16.1 Introduction 


Lagrangian and Hamiltonian mechanics have been used to determine the equations of motion for discrete systems
having a finite number of discrete variables $q_i$ where $1 \le i \le n$. There are important classes of systems
where it is more convenient to treat the system as being continuous. For example, the interatomic spacing in
solids is a few $10^{-10}$ m, which is negligible compared with the size of typical macroscopic, three-dimensional
solid objects. As a consequence, for wavelengths much greater than the atomic spacing in solids, it is use- 
ful to treat macroscopic crystalline lattice systems as continuous three-dimensional uniform solids, rather 
than as three-dimensional discrete lattice chains. Fluid and gas dynamics are other examples of continuous 
mechanical systems. Another important class of continuous systems involves the theory of fields, such as 
electromagnetic fields. Lagrangian and Hamiltonian mechanics of the continua extend classical mechanics 
into the advanced topic of field theory. This chapter goes beyond the scope of a typical undergraduate 
classical mechanics course in order to provide a brief glimpse of how Lagrangian and Hamiltonian mechanics 
can underlie advanced and important aspects of the mechanics of the continua, including field theory. 


16.2 The continuous uniform linear chain 


The Lagrangian for the discrete lattice chain, for longitudinal modes, is given by equation 14.76 to be

$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left( m\dot{q}_j^2 - \kappa\left(q_{j-1} - q_j\right)^2 \right) \tag{16.1}$$

where the n masses are attached in series to n+1 identical springs of length d and spring constant $\kappa$. Assume
that the spring has a uniform cross-section area A and length d. Then each spring volume element $\Delta\tau = Ad$
has a mass m, that is, the volume mass density $\rho = \frac{m}{\Delta\tau}$ or $m = \rho\Delta\tau$. Chapter 16.5.3 will show that the
spring constant $\kappa = \frac{EA}{d}$ where E is Young’s modulus, A is the cross sectional area of the chain element, and
d is the length of the element. Then the spring constant can be written as $\kappa = \frac{E\Delta\tau}{d^2}$. Therefore equation
16.1 can be expressed as a sum over volume elements $\Delta\tau = Ad$

$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left( \rho\dot{q}_j^2 - E\left(\frac{q_{j-1} - q_j}{d}\right)^2 \right)\Delta\tau \tag{16.2}$$


In the limit that $n \rightarrow \infty$ and the spacing $d = dx \rightarrow 0$, the summation in equation 16.2 can be written
as a volume integral where $x = jd$ is the distance along the linear chain and the volume element $\Delta\tau \rightarrow 0$.
Then the Lagrangian can be written as the integral over the volume element $d\tau$ rather than a summation
over $\Delta\tau$. That is,

$$L = \frac{1}{2}\int\left( \rho\dot{q}^2 - E\left(\frac{q(x) - q(x+dx)}{dx}\right)^2 \right)d\tau \tag{16.3}$$

The discrete-chain coordinate $q(t)$ is assumed to be a continuous function $q(x,t)$ for the uniform chain. Thus
the integral form of the Lagrangian can be expressed as

$$L = \frac{1}{2}\int\left( \rho\dot{q}^2 - E\left(\frac{\partial q(x,t)}{\partial x}\right)^2 \right)d\tau = \int\mathfrak{L}\,d\tau \tag{16.4}$$

where the function $\mathfrak{L}$ is called the Lagrangian density defined by

$$\mathfrak{L} \equiv \frac{1}{2}\left( \rho\dot{q}^2 - E\left(\frac{\partial q(x,t)}{\partial x}\right)^2 \right) \tag{16.5}$$

The variable x in the Lagrangian density is not a generalized coordinate; it only serves the role of a continuous
index played previously by the index j. For the discrete case, each value of j defined a different generalized
coordinate $q_j$. Now for each value of x there is a continuous function $q(x,t)$ which is a function of both
position and time.

Lagrange’s equations of motion applied to the continuous Lagrangian in equation 16.4 give

$$\rho\frac{\partial^2 q}{\partial t^2} - E\frac{\partial^2 q}{\partial x^2} = 0 \tag{16.6}$$


This is the familiar wave equation in one dimension for a longitudinal wave on the continuous chain with a
phase velocity

$$v_{phase} = \sqrt{\frac{E}{\rho}} \tag{16.7}$$

The continuous linear chain also can exhibit transverse modes which have a Lagrangian density where the
Young’s modulus E is replaced by the tension $\tau$ in the chain, and $\rho$ is replaced by the linear mass density $\mu$
of the chain, leading to a phase velocity for a transverse wave $v_{phase} = \sqrt{\frac{\tau}{\mu}}$.
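
The continuum limit can be illustrated numerically. The sketch below is not from the text: it assumes the standard dispersion relation for a uniform discrete chain, $\omega = 2\sqrt{\kappa/m}\,|\sin(kd/2)|$, together with $\kappa = EA/d$ and $m = \rho Ad$ from the discussion above, and compares the resulting phase velocity with the continuum value $\sqrt{E/\rho}$. The function name and the steel-like material values are assumptions chosen only for illustration.

```python
import math


def chain_phase_velocity(youngs_modulus, density, area, spacing, wavelength):
    """Phase velocity omega/k for the discrete chain (assumed dispersion relation)."""
    kappa = youngs_modulus * area / spacing  # spring constant of one element
    mass = density * area * spacing          # mass of one element
    k = 2.0 * math.pi / wavelength           # wavenumber
    omega = 2.0 * math.sqrt(kappa / mass) * abs(math.sin(0.5 * k * spacing))
    return omega / k


E, rho, A = 2.0e11, 7.8e3, 1.0e-4  # assumed steel-like values in SI units
d = 1.0e-10                        # assumed element spacing in metres

v_continuum = math.sqrt(E / rho)   # phase velocity sqrt(E/rho) of the continuous chain

# Wavelength much longer than the spacing: the chain behaves as a continuum.
ratio_long = chain_phase_velocity(E, rho, A, d, 1.0e-3) / v_continuum
# Wavelength of only four spacings: the discreteness lowers the phase velocity.
ratio_short = chain_phase_velocity(E, rho, A, d, 4.0 * d) / v_continuum

print(v_continuum, ratio_long, ratio_short)
```

Under these assumptions the ratio is $\sin(kd/2)/(kd/2)$, so it approaches unity for wavelengths much greater than the spacing and falls to about 0.90 when the wavelength is four spacings, which is why the continuous description is restricted to long wavelengths.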


16.3 The Lagrangian density formulation for continuous systems 


16.3.1 One spatial dimension 


In general the Lagrangian density can be a function of $q$, $\nabla q$, $\dot{q}$, x, y, z, and t. It is of interest that Hamilton’s
principle leads to a set of partial differential equations of motion, based on the Lagrangian density, that are
analogous to the Lagrange equations of motion for discrete systems. When deriving the Lagrangian equations
of motion in terms of the Lagrangian density using Hamilton’s principle, the notation is simplified if the
system is limited to one spatial coordinate x. In addition, it is convenient to use the compact notation
where the spatial derivative is written $q' \equiv \frac{dq}{dx}$ and the time derivative is $\dot{q} \equiv \frac{dq}{dt}$, and the one-dimensional
Lagrangian density is assumed to be a function $\mathfrak{L}(q, q', \dot{q}, x, t)$. The appearance of the derivative $q' \equiv \frac{dq}{dx}$ as
an argument of the Lagrange density is a consequence of the continuous dependence of q on x. In principle,
higher-order derivatives could occur but they do not arise in most problems of physical interest.

Assuming that the one spatial dimension is x, then Hamilton’s principle of least action can be expressed
in terms of the Lagrangian density as

$$\delta S = \delta\int_{t_1}^{t_2} L(q,\dot{q},t)\,dt = \delta\int_{t_1}^{t_2}\int_{x_1}^{x_2}\mathfrak{L}(q, q', \dot{q}, x, t)\,dx\,dt = 0 \tag{16.8}$$


Following the same approach used in chapter 5.2, it is assumed that the stationary path for the action
integral is described by the function $q(x,t)$. Define a neighboring function using a parametric representation
$q(x,t;\epsilon)$ such that when $\epsilon = 0$, the extremum function $q = q(x,t)$ yields the stationary action integral S.
Assume that an infinitesimal fraction $\epsilon$ of a neighboring function $\eta(x,t)$ is added to the extremum path
$q(x,t)$. That is, assume

$$q(x,t;\epsilon) = q(x,t) + \epsilon\eta(x,t) \tag{16.9}$$
$$q'(x,t;\epsilon) \equiv \frac{dq(x,t;\epsilon)}{dx} = \frac{dq(x,t)}{dx} + \epsilon\frac{d\eta(x,t)}{dx} = q'(x,t) + \epsilon\eta'(x,t) \tag{16.10}$$
$$\dot{q}(x,t;\epsilon) \equiv \frac{dq(x,t;\epsilon)}{dt} = \frac{dq(x,t)}{dt} + \epsilon\frac{d\eta(x,t)}{dt} = \dot{q}(x,t) + \epsilon\dot{\eta}(x,t) \tag{16.11}$$

where it is assumed that both the extremum function $q(x,t)$ and the auxiliary function $\eta(x,t)$ are well
behaved functions of x and t, with continuous first derivatives, and that $\eta(x,t) = 0$ at $(x_1, t_1)$ and $(x_2, t_2)$
because, for all possible paths, the function $q(x,t;\epsilon)$ must be identical with $q(x,t)$ at the end points of the
path, i.e. $\eta(x_1,t_1) = \eta(x_2,t_2) = 0$.

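A parametric family of curves $S(\epsilon)$, as a function of the admixture coefficient $\epsilon$, is described by the
function

$$S(\epsilon) = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\mathfrak{L}\left(q(x,t;\epsilon), q'(x,t;\epsilon), \dot{q}(x,t;\epsilon), x, t\right)dx\,dt \tag{16.12}$$

Then Hamilton’s principle requires that the action integral be a stationary function value for $\epsilon = 0$, that is,
$S(\epsilon)$ is independent of $\epsilon$ to first order, which is satisfied if

$$\frac{\partial S}{\partial\epsilon} = \int_{t_1}^{t_2}\int_{x_1}^{x_2}\left( \frac{\partial\mathfrak{L}}{\partial q}\frac{\partial q}{\partial\epsilon} + \frac{\partial\mathfrak{L}}{\partial\dot{q}}\frac{\partial\dot{q}}{\partial\epsilon} + \frac{\partial\mathfrak{L}}{\partial q'}\frac{\partial q'}{\partial\epsilon} \right)dx\,dt = 0 \tag{16.13}$$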


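Equations 16.9, 16.10, and 16.11 give the partial differentials

$$\frac{\partial q}{\partial\epsilon} = \eta(x,t) \tag{16.14}$$
$$\frac{\partial q'}{\partial\epsilon} = \eta'(x,t) \tag{16.15}$$
$$\frac{\partial\dot{q}}{\partial\epsilon} = \dot{\eta}(x,t) \tag{16.16}$$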


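Integration by parts in both the x and t terms in equation 16.13, plus using the fact that $\eta(x_1,t_1) =
\eta(x_2,t_2) = 0$ at both end points, yields

$$\int_{t_1}^{t_2}\frac{\partial\mathfrak{L}}{\partial\dot{q}}\frac{\partial\dot{q}}{\partial\epsilon}\,dt = -\int_{t_1}^{t_2}\frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right)\frac{\partial q}{\partial\epsilon}\,dt \tag{16.17}$$
$$\int_{x_1}^{x_2}\frac{\partial\mathfrak{L}}{\partial q'}\frac{\partial q'}{\partial\epsilon}\,dx = -\int_{x_1}^{x_2}\frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right)\frac{\partial q}{\partial\epsilon}\,dx \tag{16.18}$$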


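Therefore Hamilton’s principle, equation 16.13, becomes

$$\int_{t_1}^{t_2}\int_{x_1}^{x_2}\left[ \frac{\partial\mathfrak{L}}{\partial q} - \frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right) - \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right) \right]\eta(x,t)\,dx\,dt = 0 \tag{16.19}$$

Since the auxiliary function $\eta(x,t)$ is arbitrary, the integrand term in the square brackets of equation
16.19 must equal zero. That is,

$$\frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0 \tag{16.20}$$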


Equation 16.20 gives the equations of motion in terms of the Lagrangian density that has been derived 
based on Hamilton’s principle. 
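
As a consistency check, equation 16.20 can be applied to the Lagrangian density of the uniform linear chain, equation 16.5. The short derivation below is a sketch that takes equation 16.20 in the form $\frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{q}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial q'}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0$ and assumes that $\rho$ and E are uniform along the chain.

```latex
% Sketch: equation 16.20 applied to the chain Lagrangian density of equation 16.5,
% assuming rho and E do not depend on x or t.
\begin{align*}
\mathfrak{L} &= \tfrac{1}{2}\left(\rho\,\dot{q}^{2} - E\,q'^{2}\right) \\
\frac{\partial\mathfrak{L}}{\partial\dot{q}} &= \rho\,\dot{q}, \qquad
\frac{\partial\mathfrak{L}}{\partial q'} = -E\,q', \qquad
\frac{\partial\mathfrak{L}}{\partial q} = 0 \\
\frac{\partial}{\partial t}\left(\rho\,\dot{q}\right)
 + \frac{\partial}{\partial x}\left(-E\,q'\right) &= 0
\quad\Longrightarrow\quad
\rho\,\frac{\partial^{2}q}{\partial t^{2}} - E\,\frac{\partial^{2}q}{\partial x^{2}} = 0
\end{align*}
```

This recovers the one-dimensional wave equation for the continuous chain with phase velocity $\sqrt{E/\rho}$, as obtained in chapter 16.2.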


16.3.2 Three spatial dimensions 


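Equation 16.4 expresses the Lagrangian as an integral of the Lagrangian density over a single continuous
index $q(x,t)$ where the Lagrangian density is a function $\mathfrak{L}(q, q', \dot{q}, x, t)$. The derivation of the Lagrangian
equations of motion in terms of the Lagrangian density for three spatial dimensions involves the straightforward
addition of the y and z coordinates. That is, in three dimensions the vector displacement is expressed
by the vector $\mathbf{q}(x,y,z,t)$ and the Lagrangian density is related to the Lagrangian by integration over three
dimensions. That is, they are related by the equation

$$L = \int\mathfrak{L}(\mathbf{q}, \dot{\mathbf{q}}, \nabla\cdot\mathbf{q}, x, y, z, t)\,d\tau \tag{16.21}$$

where, in cartesian coordinates, the volume element $d\tau = dx\,dy\,dz$. The Lagrangian density is a function
$\mathfrak{L}(\mathbf{q}, \dot{\mathbf{q}}, \nabla\cdot\mathbf{q}, x, y, z, t)$ where the one field quantity $q(x,t)$ has been extended to a spatial vector $\mathbf{q}(x,y,z,t)$
and the spatial derivatives $q'$ have been transformed into $\nabla\cdot\mathbf{q}$. Applying the method used for the one-dimensional
spatial system to the three-dimensional system leads to the following set of equations of motion

$$\frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\dot{\mathbf{q}}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial\mathbf{q}}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial\mathbf{q}}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial\mathbf{q}}{\partial z}}\right) - \frac{\partial\mathfrak{L}}{\partial\mathbf{q}} = 0 \tag{16.22}$$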


where the x, y, z spatial derivatives have been written explicitly for clarity. 

Note that the equations of motion, equation 16.22, treat the spatial and time coordinates symmetrically.
This symmetry between space and time is unchanged by multiplying the spatial and time coordinates by
arbitrary numerical factors. This suggests the possibility of introducing a four-dimensional coordinate system

$$x_\mu \equiv \{x, y, z, \alpha t\}$$

where the parameter $\alpha$ is freely chosen. Using this four-dimensional formalism allows equation 16.22 to be
written more compactly as

$$\sum_{\mu}^{4}\frac{\partial}{\partial x_\mu}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial\mathbf{q}}{\partial x_\mu}}\right) - \frac{\partial\mathfrak{L}}{\partial\mathbf{q}} = 0 \tag{16.23}$$


As discussed in chapter 17, relativistic mechanics treats time and space symmetrically, that is, a four- 
dimensional vector q (x, y, z,t) can be used that treats time and the three spatial dimensions symmetrically 
and equally. This four-dimensional space-time formulation allows the first four terms in equation 16.22 to be 
condensed into a single term which illustrates the symmetry underlying equation 16.23. If the Lagrangian 
density is Lorentz invariant, and if $\alpha = ic$, then equation 16.23 is covariant. Thus the Lagrangian density
formulation is ideally suited to the development of relativistically covariant descriptions of fields. 


16.4 The Hamiltonian density formulation for continuous systems 


Chapter 16.3 illustrates, in general terms, how field theory can be expressed in a Lagrangian formulation 
via use of the Lagrange density. It is equally possible to obtain a Hamiltonian formulation for continuous 
systems analogous to that obtained for discrete systems. As summarized in chapter 8, the Hamiltonian 
and Hamilton’s canonical equations of motion are related directly to the Lagrangian by use of a Legendre 
transformation. The Hamiltonian is defined as being 


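$$H \equiv \sum_i\left(\dot{q}_i\frac{\partial L}{\partial\dot{q}_i}\right) - L \tag{16.24}$$

The generalized momentum is defined to be

$$p_i \equiv \frac{\partial L}{\partial\dot{q}_i} \tag{16.25}$$

Equation 16.25 allows the Hamiltonian 16.24 to be written in terms of the conjugate momenta as

$$H = \sum_i p_i\dot{q}_i - L = \sum_i\left(p_i\dot{q}_i - L_i\right) \tag{16.26}$$

where the Lagrangian has been partitioned into the terms for each of the individual coordinates, that is,
$L(q_i, \dot{q}_i, t) = \sum_i L_i(q_i, \dot{q}_i, t)$.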

In the limit that the coordinates q, p are continuous, the summation in equation 16.26 can be
transformed into a volume integral over the Lagrangian density $\mathfrak{L}$. In addition, a momentum density can be
represented by the vector field $\boldsymbol{\pi}$ where

$$\boldsymbol{\pi} \equiv \frac{\partial\mathfrak{L}}{\partial\dot{\mathbf{q}}} \tag{16.27}$$

Then the obvious definition of the Hamiltonian density $\mathfrak{H}$ is

$$H = \int\mathfrak{H}\,d\tau = \int\left(\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L}\right)d\tau \tag{16.28}$$

where the Hamiltonian density is defined to be

$$\mathfrak{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L} \tag{16.29}$$


Unfortunately the Hamiltonian density formulation does not treat space and time symmetrically, making
it more difficult to develop relativistically covariant descriptions of fields. Hamilton’s principle can be used
to derive the Hamilton equations of motion in terms of the Hamiltonian density analogous to the approach
used to derive the Lagrangian density equations of motion. As described in Classical Mechanics, 2nd edition,
by Goldstein, the resultant Hamilton equations of motion for one dimension are

$$\frac{\partial\mathfrak{H}}{\partial\pi} = \dot{q} \tag{16.30}$$
$$\frac{\partial\mathfrak{H}}{\partial q} - \frac{d}{dx}\left(\frac{\partial\mathfrak{H}}{\partial q'}\right) = -\dot{\pi} \tag{16.31}$$
$$\frac{\partial\mathfrak{H}}{\partial t} = -\frac{\partial\mathfrak{L}}{\partial t} \tag{16.32}$$

Note that equation 16.31 differs from that for discrete systems.
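
The Hamiltonian density formulation can be illustrated with the uniform linear chain of chapter 16.2. The sketch below assumes the chain Lagrangian density $\mathfrak{L} = \frac{1}{2}(\rho\dot{q}^2 - Eq'^2)$ with uniform $\rho$ and E, and assumes that the one-dimensional Hamilton equations take the form $\partial\mathfrak{H}/\partial\pi = \dot{q}$ and $\partial\mathfrak{H}/\partial q - \frac{d}{dx}\left(\partial\mathfrak{H}/\partial q'\right) = -\dot{\pi}$.

```latex
% Sketch: Hamiltonian density of the uniform chain, assuming uniform rho and E.
\begin{align*}
\pi &= \frac{\partial\mathfrak{L}}{\partial\dot{q}} = \rho\,\dot{q} \\
\mathfrak{H} &= \pi\,\dot{q} - \mathfrak{L}
  = \frac{\pi^{2}}{2\rho} + \frac{1}{2}E\,q'^{2} \\
\frac{\partial\mathfrak{H}}{\partial\pi} &= \frac{\pi}{\rho} = \dot{q} \\
\frac{\partial\mathfrak{H}}{\partial q}
  - \frac{d}{dx}\left(\frac{\partial\mathfrak{H}}{\partial q'}\right)
  &= -E\,q'' = -\dot{\pi} = -\rho\,\ddot{q}
\quad\Longrightarrow\quad
\rho\,\frac{\partial^{2}q}{\partial t^{2}} = E\,\frac{\partial^{2}q}{\partial x^{2}}
\end{align*}
```

The Hamiltonian density is the sum of the kinetic and elastic potential energy densities, and the spatial-derivative term, which has no counterpart for discrete systems, is what supplies the restoring force in the resulting wave equation.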


16.5 Linear elastic solids 


Elasticity is a property of matter where the atomic forces in matter act to restore the shape of a solid when 
distorted due to the application of external forces. A perfectly elastic material returns to its original shape 
if the external force producing the deformation is removed. Materials are elastic when the external forces 
do not exceed the elastic limit. Above the elastic limit, solids can exhibit plastic flow and concomitant heat 
dissipation. Such non-elastic behavior in solids occurs when they are subject to strong external forces. 

The discussion of linear systems, in chapters 3 and 14, focussed on one dimensional systems, such as the 
linear chain, where the transverse rigidity of the chain was ignored. An extension of the one-dimensional 
linear chain to two-dimensional membranes, such as a drum skin, is straightforward if the membrane is thin 
enough so that the rigidity of the membrane can be ignored. Elasticity for three-dimensional solids requires 
accounting for the strong elastic forces exerted against any change in shape in addition to elastic forces 
opposing change in volume. The stiffness of solids to changes in shape, or volume, is best represented using 
the concepts of stress and strain. 

Forces in matter can be divided into two classes: (1) body forces, such as gravity, which act on each
volume element, and (2) surface forces, which are the forces that act on both sides of any infinitesimal
surface element inside the solid. Surface forces can have components along the normal to the infinitesimal
surface, as well as shear components in the plane of the surface element. Typically solids are elastic to both
normal and shear components of the surface forces whereas shear forces in liquids and gases lead to fluid
flow plus viscous forces due to energy dissipation. As described below, the forces acting on an infinitesimal
surface element are best expressed in terms of the stress tensor, while the relative distortion of the shape, 
or volume, of the body are best expressed in terms of the strain tensor. The moduli of elasticity relate the 
ratio of the corresponding stress and strain tensors. The moduli of elasticity are constant in linear elastic 
solids and thus the stress is proportional to the strain providing that the strains do not exceed the elastic 
limit. 

16.5.1 Stress tensor


Consider an infinitesimal surface area dA of an arbitrary closed volume element dV inside the medium.
The surface area element is defined as a vector $d\mathbf{A} = \hat{\mathbf{n}}\,dA$ where $\hat{\mathbf{n}}$ is the outward normal to the closed
surface that encloses the volume element. Assume that dF is the force element exerted by the outside on
the material inside the volume element. The stress tensor T is defined as the ratio of dF and dA where the
force vector dF is given by the inner product of the stress tensor T and the surface element vector dA. That
is,

$$d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{16.33}$$

Since both dF and dA are vectors, equation 16.33 implies that the stress tensor must be a second-rank
tensor as described in appendix F, that is, the stress tensor is analogous to the rotation matrix or the inertia
tensor. Note that if dF and $\hat{\mathbf{n}}\,dA$ are collinear, then the stress tensor T reduces to the conventional pressure
P. The general stress tensor equals the momentum flux density and has the dimensions of pressure.


16.5.2 Strain tensor 


Forces applied to a solid body can lead to translational, or rotational acceleration, in addition to changing
the shape or volume of the body. Elastic forces do not act when an overall displacement $\boldsymbol{\xi}$ of an infinitesimal
volume occurs, such as is involved in translational or rotational motion. Elastic forces act to oppose position-dependent
differences in the displacement vector $\boldsymbol{\xi}$, that is, the strain depends on the tensor product $\nabla\otimes\boldsymbol{\xi}$.
For an elastic medium, the strain depends only on the applied stress and not on the prior loading history.

Consider that the matter at the location r is subject to an elastic displacement $\boldsymbol{\xi}$, and similarly at a
displaced location $\mathbf{r}' = \mathbf{r} + \sum_i dx_i$ where $x_i$ are cartesian coordinates. The net relative displacement
between r and r′ is given by

$$d\xi^2 = \sum_i\left(dx_i + d\xi_i\right)^2 - \sum_i\left(dx_i\right)^2 = \sum_{ik}\left(\frac{d\xi_i}{dx_k} + \frac{d\xi_k}{dx_i}\right)dx_i\,dx_k + \sum_{ikl}\frac{d\xi_l}{dx_i}\frac{d\xi_l}{dx_k}\,dx_i\,dx_k \tag{16.34}$$

Ignoring the second-order term $\frac{d\xi_l}{dx_i}\frac{d\xi_l}{dx_k}$, equation 16.34 gives that the $i^{th}$ component is

$$d\xi_i^2 = \sum_k\frac{1}{2}\left(\frac{d\xi_i}{dx_k} + \frac{d\xi_k}{dx_i}\right)dx_i\,dx_k \tag{16.35}$$

Define the elements of the strain tensor to be given by

$$\sigma_{ik} = \frac{1}{2}\left(\frac{d\xi_i}{dx_k} + \frac{d\xi_k}{dx_i}\right) \tag{16.36}$$

then

$$d\xi_i^2 = \sum_k\sigma_{ik}\,dx_i\,dx_k \tag{16.37}$$


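Thus the strain tensor $\boldsymbol{\sigma}$ is a rank-2 tensor defined as the ratio of the strain vector $\boldsymbol{\xi}$ and the infinitesimal
area vector dA.

$$d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{A} \tag{16.38}$$

where the component form of the rank-2 strain tensor is

$$\boldsymbol{\sigma} = \begin{pmatrix} \frac{d\xi_1}{dx_1} & \frac{d\xi_1}{dx_2} & \frac{d\xi_1}{dx_3} \\ \frac{d\xi_2}{dx_1} & \frac{d\xi_2}{dx_2} & \frac{d\xi_2}{dx_3} \\ \frac{d\xi_3}{dx_1} & \frac{d\xi_3}{dx_2} & \frac{d\xi_3}{dx_3} \end{pmatrix} \tag{16.39}$$

The potential-energy density for linear elastic forces is quadratic in the strain components. That is, it is
of the form

$$U = \frac{1}{2}\sum_{ijkl}C_{ijkl}\,\sigma_{ij}\sigma_{kl} \tag{16.40}$$

where $C_{ijkl}$ is a rank-4 tensor. No preferential directions remain for a homogeneous isotropic elastic body
which allows for two contractions, thereby reducing the potential energy density to the inner product

$$U = \frac{1}{2}\sum_{ik}C_{ik}\left(\sigma_{ik}\right)^2 \tag{16.41}$$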

16.5.3 Moduli of elasticity


The modulus of elasticity of a body is defined to be the slope of the stress-strain curve and thus, in 
principle, it is a complicated rank-4 tensor that characterizes the elastic properties of a material. Thus the 
general theory of elasticity is complicated because the elastic properties depend on the orientation of the 
microscopic composition of the elastic matter. The theory simplifies considerably for homogeneous, isotropic 
linear materials below the elastic limit, where the strain is proportional to the applied stress. That is, the 
modulus of elasticity then reduces by contractions to a constant scalar value that depends on the properties 
of the matter involved. 

The potential energy density for homogeneous, isotropic, linear material, equation 16.41, can be separated
into diagonal and off-diagonal components of the strain tensor. That is,

$$U = \frac{1}{2}\lambda\left(\sum_i\sigma_{ii}\right)^2 + \mu\sum_{ik}\left(\sigma_{ik}\right)^2 \tag{16.42}$$

The diagonal first term is the dilation term which corresponds to changes in the volume with no changes
in shape. The off-diagonal second term involves the shear terms that correspond to changes of the shape of
the body that also change the volume. The constants $\lambda$ and $\mu$ are Lamé’s moduli of elasticity which are
positive. The various moduli of elasticity, corresponding to different distortions in the shape and volume of
any solid body, can be derived from Lamé’s moduli for the material.

The components of the elastic forces can be derived from the gradient of the elastic potential energy,
equation 16.42, by use of Gauss’ law plus vector differential calculus. The components of the elastic force,
derived from the strain tensor $\boldsymbol{\sigma}$, can be associated with the corresponding components of the stress tensor
T. Thus, for homogeneous isotropic linear materials, the components of the stress tensor are related to the
strain tensor by the relation

$$T_{ij} = \lambda\delta_{ij}\sum_k\frac{d\xi_k}{dx_k} + \mu\left(\frac{d\xi_i}{dx_j} + \frac{d\xi_j}{dx_i}\right) = \lambda\delta_{ij}\sum_k\sigma_{kk} + 2\mu\sigma_{ij} \tag{16.43}$$

where it has been assumed that $\sigma_{ij} = \sigma_{ji}$. The two moduli of elasticity $\lambda$ and $\mu$ are material-dependent
constants. Equation 16.43 can be written in tensor notation as

$$\mathbf{T} = \lambda\,\mathrm{tr}(\boldsymbol{\sigma})\,\mathbf{I} + 2\mu\boldsymbol{\sigma} \tag{16.44}$$

where $\mathrm{tr}(\boldsymbol{\sigma})$ is the trace of the strain tensor and I is the identity matrix.
Equation 16.44 can be inverted to give the strain tensor components in terms of the stress tensor components.

$$\sigma_{ij} = \frac{1}{2\mu}\left( T_{ij} - \frac{\lambda}{(3\lambda + 2\mu)}\delta_{ij}\sum_k T_{kk} \right) \tag{16.45}$$
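
The stress-strain relation and its inverse can be verified with a short numerical round trip. The Python sketch below is illustrative only: the function names, the Lamé constants (in arbitrary units) and the symmetric test strain are assumptions, while the two formulas are $\mathbf{T} = \lambda\,\mathrm{tr}(\boldsymbol{\sigma})\mathbf{I} + 2\mu\boldsymbol{\sigma}$ and its inverse $\sigma_{ij} = \frac{1}{2\mu}\left(T_{ij} - \frac{\lambda}{3\lambda+2\mu}\delta_{ij}\sum_k T_{kk}\right)$.

```python
def stress_from_strain(strain, lam, mu):
    """T = lam * tr(sigma) * I + 2 * mu * sigma for a 3x3 strain tensor."""
    trace = sum(strain[i][i] for i in range(3))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
             for j in range(3)] for i in range(3)]


def strain_from_stress(stress, lam, mu):
    """Inverse relation: sigma = (T - lam/(3 lam + 2 mu) * tr(T) * I) / (2 mu)."""
    trace = sum(stress[i][i] for i in range(3))
    factor = lam / (3.0 * lam + 2.0 * mu)
    return [[(stress[i][j] - factor * trace * (1.0 if i == j else 0.0)) / (2.0 * mu)
             for j in range(3)] for i in range(3)]


lam, mu = 1.2, 0.8  # assumed Lame constants in arbitrary units
strain = [[1.0e-3, 2.0e-4, 0.0],
          [2.0e-4, -5.0e-4, 1.0e-4],
          [0.0, 1.0e-4, 3.0e-4]]  # assumed symmetric test strain

stress = stress_from_strain(strain, lam, mu)
recovered = strain_from_stress(stress, lam, mu)

# Largest component-wise difference between the original and recovered strain.
round_trip_error = max(abs(recovered[i][j] - strain[i][j])
                       for i in range(3) for j in range(3))
print(stress[0][0], stress[0][1], round_trip_error)
```

The inversion relies on the fact that the trace of the stress equals $(3\lambda + 2\mu)$ times the trace of the strain, so the round-trip error should be at the level of floating-point rounding.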


The various moduli of elasticity relate combinations of different stress and strain tensor components. The 
following five elastic moduli are used frequently to describe elasticity in homogeneous isotropic media, and 
all are related to Lamé’s two moduli of elasticity. 

1) Young’s modulus E describes tensile elasticity, which is the axial stiffness of the length of a body to
deformation along the axis of the applied tensile force.

$$E \equiv \frac{T_{11}}{\sigma_{11}} = \frac{\mu(3\lambda + 2\mu)}{(\lambda + \mu)} \tag{16.46}$$

2) Bulk modulus B defines the relative dilation or compression of a body’s volume to pressure
applied uniformly in all directions.

$$B = \lambda + \frac{2}{3}\mu \tag{16.47}$$

The bulk modulus is an extension of Young’s modulus to three dimensions and typically is larger than E.
The inverse of the bulk modulus is called the compressibility of the material.

3) Shear modulus G describes the shear stiffness of a body to volume-preserving shear deformations.
The shear strain $\sigma$ becomes a deformation angle given by the ratio of the displacement along the axis of the
shear force and the perpendicular moment arm. The shear modulus G equals Lamé’s constant $\mu$. That is,

$$G = \mu \tag{16.48}$$

4) Poisson’s ratio $\nu$ is the negative ratio of the transverse to axial strain. It is a measure of the volume
conserving tendency of a body to contract in the directions perpendicular to the axis along which it is
stretched. In terms of Lamé’s constants, Poisson’s ratio equals

$$\nu = \frac{\lambda}{2(\lambda + \mu)} \tag{16.49}$$

Note that for a stable, isotropic elastic material, Poisson’s ratio is bounded between $-1.0 \le \nu \le 0.5$ to ensure
that the B, $\mu$ and $\lambda$ moduli have positive values. At the incompressible limit, $\nu = 0.5$, and the bulk modulus
and Lamé parameter $\lambda$ are infinite, that is, the compressibility is zero. Typical solids have Poisson’s ratios
of $\nu = 0.05$ if hard and $\nu = 0.25$ if soft.

The stiffness of elastic solids in terms of the elastic moduli of solids can be complicated due to the
geometry and composition of solid bodies. Often it is more convenient to express the stiffness in terms of
the spring constant $\kappa$ where, for an element of cross-sectional area A and length d,

$$\kappa = \frac{EA}{d} \tag{16.50}$$


The spring constant is inversely proportional to the length of the spring because the strain of the material 
is defined to be the fractional deformation, not the absolute deformation. 
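
The relations between the elastic moduli and Lamé’s constants are simple to evaluate. The Python sketch below is illustrative: the function name `elastic_moduli` and the numerical Lamé constants (roughly steel-like, in pascals) are assumptions, and the two cross-checks use the standard isotropic identities $E = 2G(1+\nu)$ and $B = E/(3(1-2\nu))$, which follow algebraically from the expressions for E, B, G and $\nu$ in terms of $\lambda$ and $\mu$.

```python
def elastic_moduli(lam, mu):
    """Return (Young, bulk, shear, Poisson) from Lame's constants lam and mu."""
    young = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)
    bulk = lam + 2.0 * mu / 3.0
    shear = mu
    poisson = lam / (2.0 * (lam + mu))
    return young, bulk, shear, poisson


lam, mu = 100.0e9, 80.0e9  # assumed illustrative Lame constants in Pa
young, bulk, shear, poisson = elastic_moduli(lam, mu)

# Cross-checks using standard identities for isotropic media.
young_check = 2.0 * shear * (1.0 + poisson)
bulk_check = young / (3.0 * (1.0 - 2.0 * poisson))

print(young, bulk, shear, poisson)
print(young_check, bulk_check)
```

Only two of the moduli are independent for a homogeneous isotropic material, so any pair, such as E and $\nu$, determines the others.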


16.5.4 Equations of motion in a uniform elastic media 


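The divergence theorem (H.8) relates the volume integral of the divergence of T to the vector force
F acting on the closed surface.

$$\mathbf{F} = \oint\mathbf{T}\cdot d\mathbf{A} = \int\nabla\cdot\mathbf{T}\,d\tau = \int\mathbf{f}\,d\tau \tag{16.51}$$

That is, the inner product of the del operator, $\nabla$, and the rank-2 stress tensor T, gives the vector force
density f. This force acting on the enclosed mass $\int\rho\,d\tau$, for the closed volume, leads to an acceleration $\ddot{\boldsymbol{\xi}}$.
Thus

$$\mathbf{F} = \oint\mathbf{T}\cdot d\mathbf{A} = \int\nabla\cdot\mathbf{T}\,d\tau = \int\rho\frac{d^2\boldsymbol{\xi}}{dt^2}\,d\tau \tag{16.52}$$

Using equation 16.44 to relate the stress tensor T to the moduli of elasticity gives

$$\rho\frac{\partial^2\xi_i}{\partial t^2} = (\lambda + \mu)\frac{\partial}{\partial x_i}\left(\sum_k\frac{\partial\xi_k}{\partial x_k}\right) + \mu\sum_k\frac{\partial^2\xi_i}{\partial x_k^2} \tag{16.53}$$

where i = 1, 2, 3. In general this equation is difficult to solve. However, for the simple case of a plane wave
in the i = 1 direction, the problem reduces to the following three equations

$$\rho\frac{\partial^2\xi_1}{\partial t^2} = (\lambda + 2\mu)\frac{\partial^2\xi_1}{\partial x_1^2} \tag{16.54}$$
$$\rho\frac{\partial^2\xi_2}{\partial t^2} = \mu\frac{\partial^2\xi_2}{\partial x_1^2} \tag{16.55}$$
$$\rho\frac{\partial^2\xi_3}{\partial t^2} = \mu\frac{\partial^2\xi_3}{\partial x_1^2} \tag{16.56}$$

Equation 16.54 corresponds to a longitudinal wave travelling with velocity $v = \sqrt{\frac{\lambda + 2\mu}{\rho}}$. Equations
16.55, 16.56 correspond to two perpendicular transverse waves travelling with velocity $v = \sqrt{\frac{\mu}{\rho}}$. This
illustrates the important fact that longitudinal waves travel faster than transverse waves in an elastic solid.
Seismic waves in the Earth, generated by earthquakes, exhibit this property. Note that shearing stresses do
not exist in ideal liquids and gases since they cannot maintain shear forces and thus $\mu = 0$.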
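
The two wave velocities can be compared numerically. The sketch below evaluates $v = \sqrt{(\lambda + 2\mu)/\rho}$ for the longitudinal wave and $v = \sqrt{\mu/\rho}$ for the transverse waves. The function name `wave_speeds` and the rock-like values of $\lambda$, $\mu$ and $\rho$ are assumptions chosen only to illustrate the orders of magnitude.

```python
import math


def wave_speeds(lam, mu, rho):
    """Return (longitudinal, transverse) plane-wave speeds in an isotropic solid."""
    v_longitudinal = math.sqrt((lam + 2.0 * mu) / rho)
    v_transverse = math.sqrt(mu / rho)
    return v_longitudinal, v_transverse


# Assumed rock-like values: lam = mu = 3.0e10 Pa, rho = 2700 kg/m^3.
v_long, v_trans = wave_speeds(3.0e10, 3.0e10, 2.7e3)
speed_ratio = v_long / v_trans

# Ideal fluid limit: mu = 0, so transverse waves do not propagate.
v_long_fluid, v_trans_fluid = wave_speeds(2.2e9, 0.0, 1.0e3)

print(v_long, v_trans, speed_ratio)
print(v_long_fluid, v_trans_fluid)
```

Since $\lambda$ and $\mu$ are positive, the longitudinal speed always exceeds the transverse speed; for the special case $\lambda = \mu$ assumed here the ratio is $\sqrt{3}$.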

16.6 Electromagnetic field theory


16.6.1 Maxwell stress tensor 


Analytical formulations for continuous systems, developed for describing elasticity, are generally applicable 
when applied to other fields, such as the electromagnetic field. The use of Maxwell’s stress tensor T, to
describe momentum in the electromagnetic field, is an important example of the application of continuum 
mechanics in field theory. 

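The Lorentz force can be written as

$$\mathbf{F} = \int\rho\left(\mathbf{E} + \mathbf{v}\times\mathbf{B}\right)d\tau = \int\left(\rho\mathbf{E} + \mathbf{J}\times\mathbf{B}\right)d\tau = \int\mathbf{f}\,d\tau \tag{16.57}$$

where the force density f is defined to be

$$\mathbf{f} \equiv \left(\rho\mathbf{E} + \mathbf{J}\times\mathbf{B}\right) \tag{16.58}$$

Maxwell’s equations

$$\rho = \epsilon_0\nabla\cdot\mathbf{E} \qquad \mathbf{J} = \frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t} \tag{16.59}$$

can be used to eliminate the charge and current densities in equation 16.57

$$\mathbf{f} = \epsilon_0\left(\nabla\cdot\mathbf{E}\right)\mathbf{E} + \left(\frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\times\mathbf{B} \tag{16.60}$$
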
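Vector calculus gives that

$$\frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) = \frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} + \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} \tag{16.61}$$

while Faraday’s law gives

$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} \tag{16.62}$$

Equation 16.62 allows equation 16.61 to be rewritten as

$$\frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} = \frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) - \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} = \frac{\partial}{\partial t}\left(\mathbf{E}\times\mathbf{B}\right) + \mathbf{E}\times\left(\nabla\times\mathbf{E}\right) \tag{16.63}$$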


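Equation 16.63 can be inserted into equation 16.60. In addition, a term $\frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B}$ can be added since
$\nabla\cdot\mathbf{B} = 0$, which allows equation 16.60 to be written in the symmetric form


$$ \mathbf{f} = \epsilon_0(\nabla\cdot\mathbf{E})\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} \tag{16.64} $$

$$ = \epsilon_0(\nabla\cdot\mathbf{E})\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) - \epsilon_0\mathbf{E}\times(\nabla\times\mathbf{E}) \tag{16.65} $$
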
Using the vector identity 
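$$ \nabla(\mathbf{A}\cdot\mathbf{B}) = \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} \tag{16.66} $$

Let $\mathbf{A} = \mathbf{B} = \mathbf{E}$, then

$$ \nabla(E^2) = 2\mathbf{E}\times(\nabla\times\mathbf{E}) + 2(\mathbf{E}\cdot\nabla)\mathbf{E} \tag{16.67} $$

That is

$$ \mathbf{E}\times(\nabla\times\mathbf{E}) = \frac{1}{2}\nabla(E^2) - (\mathbf{E}\cdot\nabla)\mathbf{E} \tag{16.68} $$

Similarly

$$ \mathbf{B}\times(\nabla\times\mathbf{B}) = \frac{1}{2}\nabla(B^2) - (\mathbf{B}\cdot\nabla)\mathbf{B} \tag{16.69} $$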


Inserting equations 16.68 and 16.69 into equation 16.65 gives 


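$$ \mathbf{f} = \epsilon_0\left[(\nabla\cdot\mathbf{E})\mathbf{E} + (\mathbf{E}\cdot\nabla)\mathbf{E} - \frac{1}{2}\nabla E^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{B} - \frac{1}{2}\nabla B^2\right] - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) \tag{16.70} $$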
This complicated formula can be simplified by defining the rank-2 Maxwell stress tensor $\mathbf{T}$ which has
components


$$ T_{ij} = \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{16.71} $$


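The inner product of the del operator and the Maxwell stress tensor is a vector with $j$ components of


$$ (\nabla\cdot\mathbf{T})_j = \epsilon_0\left[(\nabla\cdot\mathbf{E})E_j + (\mathbf{E}\cdot\nabla)E_j - \frac{1}{2}\nabla_jE^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})B_j + (\mathbf{B}\cdot\nabla)B_j - \frac{1}{2}\nabla_jB^2\right] \tag{16.72} $$


The above definition of the Maxwell stress tensor, plus the Poynting vector $\mathbf{S} = \frac{1}{\mu_0}(\mathbf{E}\times\mathbf{B})$, allows the force
density equation 16.58 to be written in the form


$$ \mathbf{f} = \nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t} \tag{16.73} $$

The divergence theorem allows the total force, acting on the volume $\tau$, to be written in the form


$$ \mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau \tag{16.74} $$

$$ = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{16.75} $$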


Note that, if the Poynting vector is time independent, then the second term in equation 16.75 is zero and the
Maxwell stress tensor $\mathbf{T}$ is the force per unit area (stress) acting on the surface. The fact that $\mathbf{T}$ is a rank-2
tensor is apparent since the stress represents the ratio of the force-density vector $d\mathbf{f}$ and the infinitesimal
area vector $d\mathbf{a}$, which do not necessarily point in the same direction.
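
The component formula 16.71 can be evaluated directly. The sketch below is a minimal illustration, assuming a uniform electric field along $z$ with no magnetic field; the function name `maxwell_stress_tensor` and the field strength are hypothetical choices, and the constants are the standard SI values of $\epsilon_0$ and $\mu_0$.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 1.25663706212e-6    # vacuum permeability, H/m


def maxwell_stress_tensor(E, B):
    """Return T_ij of equation 16.71 as a 3x3 nested list (SI units, N/m^2)."""
    E2 = sum(e * e for e in E)
    B2 = sum(b * b for b in B)
    T = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0   # Kronecker delta
            T[i][j] = (EPS0 * (E[i] * E[j] - 0.5 * delta * E2)
                       + (B[i] * B[j] - 0.5 * delta * B2) / MU0)
    return T


# Assumed example: uniform field of 1e5 V/m along z, no magnetic field
E_field = [0.0, 0.0, 1.0e5]
T = maxwell_stress_tensor(E_field, [0.0, 0.0, 0.0])
energy_density = 0.5 * EPS0 * 1.0e5 ** 2   # (1/2) eps0 E^2

for row in T:
    print(row)
```

For this field the tensor is diagonal: the component along the field equals the energy density (a tension), while the two transverse components equal minus the energy density (a pressure).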


16.6.2 Momentum in the electromagnetic field 


Chapter 7.2 showed that the electromagnetic field carries a linear momentum qA where q is the charge on a 
body and A is the electromagnetic vector potential. It is useful to use the Maxwell stress tensor to express 
the momentum density directly in terms of the electric and magnetic fields. 

Newton’s law of motion can be used to write equation 16.75 as


$$ \mathbf{F} = \frac{d\mathbf{P}_{mech}}{dt} = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{16.76} $$


where $\mathbf{P}_{mech}$ is the total mechanical linear momentum of the volume $\tau$. Equation 16.76 implies that the
electromagnetic field carries a linear momentum


$$ \mathbf{P}_{field} = \epsilon_0\mu_0\int\mathbf{S}\,d\tau \tag{16.77} $$


The $\oint\mathbf{T}\cdot d\mathbf{a}$ term in equation 16.76 is the momentum per unit time flowing into the closed surface.


In field theory it can be useful to describe the behavior in terms of the momentum density $\boldsymbol{\pi}$. Thus
the momentum density $\boldsymbol{\pi}_{field}$ in the electromagnetic field is


$$ \boldsymbol{\pi}_{field} = \epsilon_0\mu_0\mathbf{S} \tag{16.78} $$


Then equation 16.76 implies that the total momentum density $\boldsymbol{\pi} = \boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}$ is related to Maxwell’s
stress tensor by


$$ \frac{\partial}{\partial t}(\boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}) = \nabla\cdot\mathbf{T} \tag{16.79} $$


That is, like the elasticity stress tensor, the divergence of Maxwell’s stress tensor $\mathbf{T}$ equals the rate of change
of the total momentum density, that is, $-\mathbf{T}$ is the momentum flux density.

This discussion of the Maxwell stress tensor and its relation to momentum in the electromagnetic field 
illustrates the role that analytical formulations of classical mechanics can play in field theory. 
16.7 Ideal fluid dynamics


The distinction between a solid and a fluid is that a fluid flows under shear stress whereas the elasticity
of solids opposes distortion and flow. Shear stress in a fluid is opposed by dissipative viscous forces, which
depend on velocity, as opposed to elastic solids where the shear stress is opposed by the elastic forces which
depend on the displacement. An ideal fluid is one where the viscous forces are negligible, and thus the shear
stress Lamé parameter $\mu = 0$.


16.7.1 Continuity equation 


Fluid dynamics requires a different philosophical approach than that used to describe the motion of an
ensemble of known solid bodies. The prior discussions of classical mechanics used, as variables, the coordinates
of each member of an ensemble of particles with known masses. This approach is not viable for fluids
which involve an enormous number of individual atoms as the fundamental bodies of the fluid. The best
philosophical approach for describing fluid dynamics is to employ continuum mechanics using definite fixed
volume elements $d\tau$ and describe the fluid in terms of macroscopic variables of the fluid such as mass density
$\rho$, pressure $P$, and fluid velocity $\mathbf{v}$.

Conservation of fluid mass requires that the rate of change of mass in a fixed volume must equal the net
inflow of mass.


$$ \frac{d}{dt}\int\rho\,d\tau + \oint\rho\mathbf{v}\cdot d\mathbf{a} = 0 \tag{16.80} $$


Using the divergence theorem (H2) allows this to be written as


$$ \int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})\right)d\tau = 0 \tag{16.81} $$


Mass conservation must hold for any arbitrary volume, therefore the continuity equation can be written in
the differential form


$$ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{16.82} $$
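
A discrete version of the continuity equation makes the mass-conservation statement concrete. The sketch below is one possible illustration, not a method taken from the text: it assumes a one-dimensional periodic grid, a constant positive velocity, and a first-order upwind flux difference; the function name `advect_density` and all numerical values are hypothetical.

```python
def advect_density(rho, v, dx, dt, steps):
    """Step d(rho)/dt + d(rho v)/dx = 0 on a periodic 1D grid.

    Uses a first-order upwind flux difference, which assumes a constant
    velocity v > 0 and a time step satisfying v*dt/dx <= 1.
    """
    rho = list(rho)
    n = len(rho)
    for _ in range(steps):
        # Mass flux rho*v leaving each cell through its right-hand face
        flux = [r * v for r in rho]
        # Change in cell i = inflow through left face minus outflow through right face;
        # flux[i - 1] wraps around for i = 0, which gives the periodic boundary
        rho = [rho[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(n)]
    return rho


n, dx, dt, v = 50, 0.02, 0.01, 1.0
rho_initial = [1.0 + (0.5 if 10 <= i < 20 else 0.0) for i in range(n)]
rho_final = advect_density(rho_initial, v, dx, dt, steps=40)

mass_initial = sum(rho_initial) * dx
mass_final = sum(rho_final) * dx
print(mass_initial, mass_final)
```

Because each face flux is subtracted from one cell and added to its neighbour, the total mass changes only by floating-point rounding, which is the discrete analogue of equation 16.80 with no net flow through the boundary.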


16.7.2 Euler’s hydrodynamic equation 


The fluid surrounding a volume $\tau$ exerts a net force $\mathbf{F}$ that equals the surface integral of the pressure $P$.
This force can be transformed to a volume integral of $\nabla P$. The net force then will lead to an acceleration
of the volume element. That is


$$ \mathbf{F} = -\oint P\,d\mathbf{a} = -\int\nabla P\,d\tau = \int\rho\frac{d\mathbf{v}}{dt}\,d\tau \tag{16.83} $$


Thus the force density $\mathbf{f}$ is given by


$$ \mathbf{f} = -\nabla P = \rho\frac{d\mathbf{v}}{dt} \tag{16.84} $$
Note that the acceleration $\frac{d\mathbf{v}}{dt}$ in equation 16.83 refers to the rate of change of velocity for individual
atoms in the fluid, not the rate of change of fluid velocity at a fixed point in space. These two accelerations
are related by noting that, during the time $dt$, the change in velocity $d\mathbf{v}$ of a given fluid particle is composed
of two parts, namely (1) the change during $dt$ in the velocity at a fixed point in space, and (2) the difference
between the velocities at that same instant in time at two points displaced a distance $d\mathbf{r}$ apart, where $d\mathbf{r}$ is
the distance moved by a given fluid particle during the time $dt$. The first part is given by $\frac{\partial\mathbf{v}}{\partial t}dt$ at a given
point $(x, y, z)$ in space. The second part equals


$$ dx\frac{\partial\mathbf{v}}{\partial x} + dy\frac{\partial\mathbf{v}}{\partial y} + dz\frac{\partial\mathbf{v}}{\partial z} = (d\mathbf{r}\cdot\nabla)\mathbf{v} \tag{16.85} $$

Thus

$$ d\mathbf{v} = \frac{\partial\mathbf{v}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\mathbf{v} \tag{16.86} $$

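Dividing both sides by $dt$ gives that the acceleration of the atoms in the fluid equals


$$ \frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} \tag{16.87} $$

Substituting equation 16.87 into 16.84 gives

$$ \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla P \tag{16.88} $$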


This is Euler’s equation for hydrodynamics. The two terms on the left represent the acceleration in the 
individual fluid components while the right-hand side lists the force density producing the acceleration. 
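
The difference between the local rate of change $\partial\mathbf{v}/\partial t$ and the acceleration of a fluid element can be checked numerically. The sketch below assumes an illustrative one-dimensional velocity field $v(x,t) = x/(1+t)$, which is not from the text. Fluid elements in this field move as $x = x_0(1+t)$ with constant velocity $x_0$, so their acceleration should vanish even though the velocity at a fixed point changes in time. The derivatives are estimated with central finite differences.

```python
def v(x, t):
    """Assumed illustrative 1D velocity field (not from the text)."""
    return x / (1.0 + t)


def accelerations(x, t, h=1.0e-5):
    """Return (material, local) accelerations at (x, t) via central differences."""
    dv_dt = (v(x, t + h) - v(x, t - h)) / (2.0 * h)   # rate of change at a fixed point
    dv_dx = (v(x + h, t) - v(x - h, t)) / (2.0 * h)   # spatial gradient at a fixed time
    material = dv_dt + v(x, t) * dv_dx                # one-dimensional form of equation 16.87
    return material, dv_dt


material, local = accelerations(x=2.0, t=1.0)
print(f"local dv/dt = {local:.6f}, acceleration of fluid element = {material:.6f}")
```

At $x = 2$, $t = 1$ the local term is $-x/(1+t)^2 = -0.5$ and the convective term $v\,\partial v/\partial x$ is $+0.5$, so the two cancel to within the finite-difference error.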

Additional forces can be added to the right-hand side. For example, the gravitational force density pg 
can be expressed in terms of the gravitational scalar potential V to be 


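$$ \rho\mathbf{g} = -\rho\nabla V \tag{16.89} $$

Inclusion of the gravitational field force density in Euler’s equation gives

$$ \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{16.90} $$
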
16.7.3 Irrotational flow and Bernoulli’s equation 


Streamlined flow corresponds to irrotational flow, that is, $\nabla\times\mathbf{v} = 0$. Since irrotational flow is curl free, the
velocity streamlines can be represented by a scalar potential field $\phi$. That is


$$ \mathbf{v} = -\nabla\phi \tag{16.91} $$


This scalar potential field $\phi$ can be used to derive the vector velocity field for irrotational flow.
Note that the $(\mathbf{v}\cdot\nabla)\mathbf{v}$ term in Euler’s equation (16.90) can be rewritten using the vector identity


$$ (\mathbf{v}\cdot\nabla)\mathbf{v} = \frac{1}{2}\nabla(v^2) - \mathbf{v}\times(\nabla\times\mathbf{v}) \tag{16.92} $$


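Inserting equation 16.92 into Euler’s equation 16.90 then gives


$$ \frac{\partial\mathbf{v}}{\partial t} - \mathbf{v}\times(\nabla\times\mathbf{v}) = -\frac{1}{\rho}\nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) \tag{16.93} $$


Potential flow corresponds to time independent irrotational flow, that is, both $\frac{\partial\mathbf{v}}{\partial t} = 0$ and $\nabla\times\mathbf{v} = 0$. For
potential flow equation 16.93 reduces to


$$ \nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) = 0 $$

which implies that

$$ \frac{1}{2}\rho v^2 + P + \rho V = \text{constant} \tag{16.94} $$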


This is the famous Bernoulli’s equation that relates the interplay of the fluid velocity, pressure and gravita- 
tional energy. Bernoulli’s equation plays important roles in both hydrodynamics and aerodynamics. 
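
A short numerical example shows how Bernoulli’s equation trades velocity against pressure. The sketch below assumes water flowing through a horizontal pipe that narrows to half its cross-sectional area, with a uniform gravitational potential $V = gh$; the function name `bernoulli_pressure` and all numerical values are illustrative assumptions, not values from the text.

```python
def bernoulli_pressure(p1, v1, h1, v2, h2, rho, g=9.81):
    """Pressure at point 2 from (1/2) rho v^2 + P + rho g h = constant.

    Assumes steady, incompressible, irrotational flow and V = g*h.
    """
    return p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2) + rho * g * (h1 - h2)


rho_water = 1000.0      # kg/m^3
p1 = 2.0e5              # Pa, assumed upstream pressure
v1 = 2.0                # m/s, assumed upstream speed
area_ratio = 2.0        # A1 / A2 for the constriction
v2 = v1 * area_ratio    # continuity for incompressible flow: A1 v1 = A2 v2

p2 = bernoulli_pressure(p1, v1, 0.0, v2, 0.0, rho_water)
print(f"v2 = {v2} m/s, p2 = {p2} Pa")
```

The pressure falls in the constriction because the fluid speeds up there, which is the same interplay that underlies Venturi meters.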


16.7.4 Gas flow


Fluid dynamics applied to gases is a straightforward extension of fluid dynamics that employs standard ther- 
modynamical concepts. The following example illustrates the application of fluid mechanics for calculating 
the velocity of sound in a gas. 
16.1 Example: Acoustic waves in a gas


Propagation of acoustic waves in a gas provides an example of using the three-dimensional Lagrangian
density. Only longitudinal waves occur in a gas and the velocity is given by thermodynamics of the gas. Let
the displacement of each gas molecule be designated by the general coordinate $\mathbf{q}$ with corresponding velocity
$\dot{\mathbf{q}}$. Let the ambient gas density be $\rho_0$, then the kinetic energy density (KED) of an infinitesimal volume of gas $\Delta\tau$ is
given by


$$ \Delta(\mathrm{KED}) = \frac{1}{2}\rho_0\dot{q}^2 $$


The rapid contractions and expansions of the gas in an acoustic wave occur adiabatically such that the product
$PV^{\gamma}$ is a constant, where $\gamma = \frac{\text{specific heat at constant pressure}}{\text{specific heat at constant volume}}$. Therefore the change in potential energy density
$\Delta(\mathrm{PED})$ is given to second order by


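$$ \Delta(\mathrm{PED}) = -P_0\frac{\Delta V}{V_0} + \frac{\gamma P_0}{2}\left(\frac{\Delta V}{V_0}\right)^2 $$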
Since the volume and density are related by


$$ V_0 = \frac{M}{\rho_0} $$


then the fractional change in the density $\sigma$ is related to the density by


$$ \rho = \rho_0(1 + \sigma) $$


This implies that the potential energy density (PED) is given by

$$ \Delta(\mathrm{PED}) = P_0\sigma + \frac{\gamma P_0}{2}\sigma^2 $$


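The mass flowing out of the volume $V_0$ must equal the fractional change in density of the volume, that is


$$ \rho_0\oint\mathbf{q}\cdot d\mathbf{S} = -\rho_0\int\sigma\,d\tau $$


The divergence theorem gives that


$$ \oint\mathbf{q}\cdot d\mathbf{S} = \int\nabla\cdot\mathbf{q}\,d\tau = -\int\sigma\,d\tau $$


Thus the fractional density change $\sigma$ is given by minus the divergence of $\mathbf{q}$


$$ \sigma = -\nabla\cdot\mathbf{q} $$

This allows the potential energy density to be written as


$$ \Delta(\mathrm{PED}) = -P_0\nabla\cdot\mathbf{q} + \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2 $$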


Combining the kinetic energy density and the potential energy density gives the complete Lagrangian density
for an acoustic wave in a gas to be


$$ \mathfrak{L} = \frac{1}{2}\rho_0\dot{q}^2 + P_0\nabla\cdot\mathbf{q} - \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2 $$


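Inserting this Lagrangian density in the corresponding equations of motion, equation 16.23, gives that


$$ \rho_0\frac{\partial^2\mathbf{q}}{\partial t^2} - \gamma P_0\nabla(\nabla\cdot\mathbf{q}) = 0 $$


where $P_0$ and $\rho_0$ are the ambient pressure and density of the gas. This is the wave equation where the phase
velocity of sound is given by

$$ v = \sqrt{\frac{\gamma P_0}{\rho_0}} $$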
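
The phase velocity $\sqrt{\gamma P_0/\rho_0}$ is easily evaluated. The minimal sketch below uses assumed textbook-style values for dry air near room temperature ($\gamma = 1.4$, $P_0 = 101325$ Pa, $\rho_0 = 1.204$ kg/m$^3$); these numbers and the function name `sound_speed` are illustrative and are not quoted in the text.

```python
import math


def sound_speed(gamma, p0, rho0):
    """Adiabatic phase velocity of sound, sqrt(gamma * P0 / rho0), in m/s."""
    return math.sqrt(gamma * p0 / rho0)


# Assumed ambient conditions for air at about 20 degrees Celsius
v_air = sound_speed(gamma=1.4, p0=101325.0, rho0=1.204)
print(f"speed of sound in air: {v_air:.1f} m/s")
```

The result is close to the commonly quoted value of roughly 343 m/s for air at room temperature.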
16.8 Viscous fluid dynamics


Viscous fluid dynamics is a branch of classical mechanics that plays a pivotal role in a wide range of aspects
of life, such as blood flow in human anatomy, weather, hydraulic engineering, and transportation by land,
sea, and air. Viscous fluid flow provides nature’s most common manifestation of nonlinearity and turbulence
in classical mechanics, and provides an excellent illustration of possible solutions of non-linear equations of
motion introduced in chapter 4. A detailed description of turbulence remains a challenging problem and
this subject has the reputation of being the last great unsolved problem in classical mechanics. There is
an apocryphal story that Werner Heisenberg was asked what he would like to ask God, if given the
opportunity. His reply was “When I meet God, I am going to ask him two questions: Why relativity? And why
turbulence? I really believe he will only have an answer to the first.”

In contrast to solids, fluids do not have elastic restoring forces to support shear stress because the fluid
flows. Shear stresses in fluids are balanced by viscous forces which are velocity dependent. There are two
mechanisms that lead to shear stress acting between adjacent fluid layers in relative motion. The first 
mechanism involves laminar flow where the viscous forces produce shear stress between adjacent layers of 
the fluid which are moving parallel along adjacent streamlines at differing velocities. Viscous forces typically 
dominate laminar flow. High viscosity fluids like honey exhibit laminar flow and are more difficult to stir 
or pour compared with low-viscosity fluids like water. The second mechanism involves turbulent flow where 
shear stress is due to momentum transfer between adjacent layers when the flow breaks up into large-scale 
coherent vortex structures which carry most of the kinetic energy. These eddies lead to transverse motion 
that transfers momentum plus heat between adjacent layers and leads to higher drag. The wing-tip vortex 
produced by the wing tip of an aircraft is an example of a dynamically-distinct, large-scale, coherent vortex 
structure which has considerable angular momentum and decays by fragmentation into a cascade of smaller 
scale structures. 


16.8.1 Navier-Stokes equation 


Viscous forces acting on the small-scale coherent structures eventually dissipate the energy in turbulent 
motion. The viscous drag can be handled in terms of a stress tensor T analogous to its use when accounting 
for the elastic restoring forces in elasticity as discussed in chapter 16.5.3. That is, the viscous force density 
is related to the deceleration of the volume element by 


$$ \frac{\partial}{\partial t}(\rho\mathbf{v}) = -\nabla\cdot\mathbf{T} \tag{16.95} $$


where the components of the stress tensor are


$$ T_{ik} = P\delta_{ik} + \rho v_iv_k \tag{16.96} $$


Note that the stress tensor gives the momentum flux density tensor, which involves a diagonal term proportional
to pressure $P$, plus a viscous drag term that is proportional to the product of two velocities.

The Navier-Stokes equations are the fundamental equations characterizing fluid flow. They are based on 
application of Newton’s second law of motion to fluids together with the assumption that the fluid stress 
is the sum of a diffusing viscous term plus a pressure term. Combining Euler’s equation, 16.90, with 16.95 
gives the Navier-Stokes equation 


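$$ \rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \nabla\cdot\mathbf{T} + \mathbf{f} \tag{16.97} $$

where $\rho$ is the fluid density, $\mathbf{v}$ is the flow velocity vector, $P$ the pressure, $\mathbf{T}$ is the shear stress tensor viscous
drag term, and $\mathbf{f}$ represents external body forces per unit volume such as gravity acting on the fluid.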

For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then the Navier-Stokes
equation simplifies to


$$ \rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{16.98} $$


where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 16.98 represents the rate of change
of momentum per unit volume while the right-hand side represents the summation of the forces per unit
volume that are acting.

The Navier-Stokes equations are nonlinear because of the $(\mathbf{v}\cdot\nabla)\mathbf{v}$ term, which is quadratic in the
velocity. This non-linearity leads to a wide spectrum of dynamic behavior ranging from ordered laminar
flow to chaotic turbulence. Numerical solution of the Navier-Stokes equations is extremely difficult because
of the wide dynamic range of the dimensions of the coherent structures involved in turbulent motion. For
example, simulation calculations require use of a high resolution mesh which is a challenge to the capabilities
of current generation computers.

The microscopic boundary condition at the interface of the solid and fluid is that the fluid molecules
have zero average tangential velocity relative to the solid-fluid interface. This implies that
there is a boundary layer for which there is a gradient in the tangential velocity of the fluid between the
solid-fluid interface and the free-stream velocity. This velocity gradient produces vorticity in the fluid. When
the viscous forces are negligible then the angular momentum in any coherent vortex structure is conserved
leading to the vortex motion being preserved as it propagates.


16.8.2 Reynolds number 


Fluid flow can be characterized by the Reynolds number $Re$, which is a dimensionless number that is a measure
of the ratio of the inertial forces $\rho v^2/L$ to viscous forces $\mu v/L^2$. That is,


$$ Re \equiv \frac{\text{Inertial forces}}{\text{Viscous forces}} = \frac{\rho vL}{\mu} = \frac{vL}{\nu} \tag{16.99} $$


where $v$ is the relative velocity between the free fluid flow and the solid surface, $L$ is a characteristic linear
dimension, $\mu$ is the dynamic viscosity of the fluid, $\nu$ is the kinematic viscosity ($\nu = \frac{\mu}{\rho}$) and $\rho$ is the density
of the fluid. The Law of Similarity implies that at a given Reynolds number, for a specific shaped solid body,
the fluid flow behaves identically independent of the size of the body. Thus one can use small models in wind
tunnels, or water-flow tanks, to accurately model fluid flow that can be scaled up to full-sized aircraft or boats
by scaling $v$ and $L$ to give the same Reynolds number.
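
The Law of Similarity can be illustrated with a few lines of Python. The sketch below assumes approximate sea-level properties of air and a hypothetical one-tenth scale model tested in the same fluid; the function names and all numbers are illustrative assumptions rather than values from the text.

```python
def reynolds_number(rho, v, L, mu):
    """Reynolds number Re = rho * v * L / mu (dimensionless)."""
    return rho * v * L / mu


def model_speed_for_similarity(v_full, L_full, L_model):
    """Flow speed giving a scale model the same Re in the same fluid."""
    return v_full * L_full / L_model


rho_air = 1.2        # kg/m^3, approximate density of air
mu_air = 1.8e-5      # Pa s, approximate dynamic viscosity of air

# Hypothetical full-size body: 1.5 m characteristic length at 50 m/s
re_full = reynolds_number(rho_air, 50.0, 1.5, mu_air)

# Hypothetical one-tenth scale model in the same air
v_model = model_speed_for_similarity(50.0, 1.5, 0.15)
re_model = reynolds_number(rho_air, v_model, 0.15, mu_air)

print(f"Re (full size) = {re_full:.3g}, model speed = {v_model:.0f} m/s, Re (model) = {re_model:.3g}")
```

In the same fluid a model ten times smaller needs a flow ten times faster to match the Reynolds number, which may be impractical, so the match can alternatively be sought by changing the fluid density or viscosity.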


16.8.3 Laminar and turbulent fluid flow


Fluid flow over a cylinder illustrates the general features of fluid flow. The drag force $F_D$ acting on a cylinder
of diameter $D$ and length $l$, with the cylindrical axis perpendicular to the fluid flow, is given by


$$ F_D = \frac{1}{2}\rho v^2C_DDl \tag{16.100} $$


where $C_D$ is the coefficient of drag. The upper part of figure 16.1 shows the dependence of the drag coefficient
$C_D$ as a function of the Reynolds number, for fluid flow that is transverse to a smooth circular cylinder. The
lower part of figure 16.1 shows the streamlines for flow around the cylinder at various Reynolds numbers for the
points identified by the letters A, B, C, D, and E on the plot of the drag coefficient versus Reynolds number for a
smooth cylinder.


Figure 16.1: Upper: The dependence of the coefficient of drag $C_D$ on Reynolds number $Re$ for fluid
flow perpendicular to a smooth circular cylinder of diameter $D$ and length $l$. Lower: Typical flow
patterns for flow past a circular cylinder at various Reynolds numbers as indicated in the upper
figure. Panel labels: no separation; steady separation bubble; oscillating Karman vortex street wake;
laminar boundary layer; turbulent boundary layer.


A) At low velocities, where $Re < 1$, the flow is laminar around the cylinder in that the low vorticity is
damped by the viscous forces and the inertial term in equation 16.98 can be ignored. The coefficient of drag $C_D$
varies inversely with $Re$ leading to drag forces that are roughly linear with velocity as described in chapter
2.10.5. The size and velocities of raindrops in a light rain shower correspond to such Reynolds numbers.


B) For 10 < Re < 30 the flow has two turbulent vortices immediately behind the body in the wake of 
the cylinder, but the flow still is primarily laminar as illustrated. 


C) For $40 < Re < 250$ the pair of vortices peel off alternately producing a regular periodic sequence of
vortices although the flow still is laminar. This vortex street is called a von Kármán vortex street for which
the velocity at a given position, relative to the cylinder, is time dependent in contrast to the situation at
lower Reynolds numbers.


D) For $10^3 < Re < 10^5$ viscous forces are negligible relative to the inertial effects of the vortices and
boundary-layer vortices have less time to diffuse into the larger region of the fluid, thus the boundary layer is
thinner. The boundary-layer flow exhibits a small scale chaotic turbulence in three dimensions superimposed
on regular alternating vortex structures. In this range $C_D$ is roughly constant and thus the drag forces are
proportional to the square of the velocity. This regime of Reynolds numbers corresponds to typical velocities
of moving automobiles.


E) For $Re \approx 10^6$, which is typical of a flying aircraft, the inertial effects dominate except in the narrow
boundary layer close to the solid-fluid interface. The chaotic region works its way further forward on the
cylinder reducing the volume of the chaotic turbulent boundary layer which results in a significant decrease
in $C_D$. For a sailplane wing flying at about 50 knots, the boundary layer reduces to the order of a millimeter
in thickness at the leading edge and a centimeter at the trailing edge. At
these Reynolds numbers the airflow comprises a thin boundary layer, where viscous effects are important,
plus fluid flow in the bulk of the fluid where the vortex inertial terms dominate and viscous forces can be
ignored. That is, the viscous stress tensor term $\nabla\cdot\mathbf{T}$, on the right-hand side of equation 16.97, can be
ignored, and the Navier-Stokes equation reduces to the simpler Euler equation for such inviscid fluid flow.


The importance of the inertia of the vortices is illustrated by the persistence of the vortex structure 
and turbulence over a wide range of length scales characteristic of turbulent flow. The dynamic range of 
the dimension of coherent vortex structures is enormous. For example, in the atmosphere the vortex size 
ranges from 10°m in diameter for hurricanes down to 10~°m in thin boundary layers adjacent to an aircraft 
wing. The transition from laminar to turbulent flow is illustrated by water flow over the hull of a ship which 
involves laminar flow at the bow followed by turbulent flow behind the bow wave and at the stern of the 
ship. The broad extent of the white foam of seawater along the side and the stern of a ship illustrates the 
considerable energy dissipation produced by the turbulence. The boundary layer of a stalled aircraft wing 
is another example. At a high angle of attack, the airflow on the lower surface of the wing remains laminar, 
that is, the stream velocity profile, relative to the wing, increases smoothly from zero at the wing surface 
outwards until it meets the ambient air velocity on the outer surface of the boundary layer which is the order 
of a millimeter thick. The flow on the top surface of the wing initially is laminar before becoming turbulent 
at which point the boundary layer rapidly increases in thickness. Further back the airflow detaches from 
the wing surface and large-scale vortex structures lead to a wide boundary layer comparable in thickness to 
the chord of the wing with vortex motion that leads to the airflow reversing its direction adjacent to the 
upper surface of the wing which greatly increases drag. When the vortices begin to shed off the bounded 
surface they do so at a certain frequency which can cause vibrations that can lead to structural failure if the 
frequency of the shedding vortices is close to the resonance frequency of the structure. 


Considerable time and effort are expended by aerodynamicists and hydrodynamicists designing aircraft
wings and ship hulls to maximize the length of the laminar region of the boundary layer to minimize drag.
When the Reynolds number is large the slightest imperfections in the shape of the wing, such as a speck of
dust, can trigger the transition from laminar to turbulent flow. The boundaries between adjacent large-scale
coherent structures are sensitively identified in computer simulations by large divergence of the streamlines
at any separatrix. A large positive, finite-time, Lyapunov exponent identifies divergence of the streamlines
which occurs at a separatrix between adjacent large-scale coherent vortex structures, whereas the Lyapunov
exponents are negative for converging streamlines within any coherent structure. Computations of turbulent
flow often combine the use of finite-time Lyapunov exponents, to identify coherent structures, with Lagrangian
mechanics for the equations of motion. Since the Lagrangian is a scalar function, it is frame independent and
it gives far better results for fluid motion than using Newtonian mechanics. Thus the Lagrangian approach in
the continua is used extensively for calculations in aerodynamics, hydrodynamics, and studies of atmospheric
phenomena such as convection, hurricanes, tornadoes, etc.
16.9 Summary and implications


The goal of this chapter is to provide a glimpse into the classical mechanics of the continua which introduces 
the Lagrangian density and Hamiltonian density formulations of classical mechanics. 


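Lagrangian density formulation: In three dimensions the Lagrangian density $\mathfrak{L}(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t)$ is
related to the Lagrangian $L$ by taking the volume integral of the Lagrangian density.


$$ L = \int\mathfrak{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t\right)dx\,dy\,dz \tag{16.21} $$


Applying Hamilton’s Principle to the three-dimensional Lagrangian density leads to the following set of
differential equations of motion


$$ \frac{\partial}{\partial t}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial t}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathfrak{L}}{\partial\frac{\partial q}{\partial z}}\right) - \frac{\partial\mathfrak{L}}{\partial q} = 0 \tag{16.22} $$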


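Hamiltonian density formulation: In the limit that the coordinates $q, p$ are continuous, then the Hamiltonian
density can be expressed in terms of a volume integral over the momentum density $\boldsymbol{\pi}$ and the Lagrangian
density $\mathfrak{L}$ where


$$ \boldsymbol{\pi} \equiv \frac{\partial\mathfrak{L}}{\partial\dot{\mathbf{q}}} \tag{16.27} $$

Then the obvious definition of the Hamiltonian density $\mathfrak{H}$ is

$$ H = \int\mathfrak{H}\,d\tau = \int(\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L})\,d\tau \tag{16.28} $$

where the Hamiltonian density is given by

$$ \mathfrak{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathfrak{L} \tag{16.29} $$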


These Lagrangian and Hamiltonian density formulations are of considerable importance to field theory 
and fluid mechanics. 


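Linear elastic solids: The theory of continuous systems was applied to the case of linear elastic solids.
The stress tensor $\mathbf{T}$ is a rank 2 tensor defined as the ratio of the force vector $d\mathbf{F}$ and the surface element
vector $d\mathbf{A}$. That is, the force vector is given by the inner product of the stress tensor $\mathbf{T}$ and the surface
element vector $d\mathbf{A}$.

$$ d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{16.33} $$


The strain tensor $\boldsymbol{\sigma}$ also is a rank 2 tensor defined as the ratio of the strain vector $\boldsymbol{\xi}$ and infinitesimal
area $d\mathbf{A}$.

$$ d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{A} \tag{16.38} $$


where the component form of the rank 2 strain tensor is


$$ \boldsymbol{\sigma} = \begin{pmatrix} \frac{d\xi_1}{dx_1} & \frac{d\xi_1}{dx_2} & \frac{d\xi_1}{dx_3} \\ \frac{d\xi_2}{dx_1} & \frac{d\xi_2}{dx_2} & \frac{d\xi_2}{dx_3} \\ \frac{d\xi_3}{dx_1} & \frac{d\xi_3}{dx_2} & \frac{d\xi_3}{dx_3} \end{pmatrix} \tag{16.39} $$
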
The modulus of elasticity is defined as the slope of the stress-strain curve. For linear, homogeneous,
elastic matter, the potential energy density $U$ separates into diagonal and off-diagonal components of the
strain tensor


$$ U = \frac{1}{2}\left[\lambda\Big(\sum_k\sigma_{kk}\Big)^2 + 2\mu\sum_{ij}\sigma_{ij}^2\right] $$


where the constants $\lambda$ and $\mu$ are Lamé’s moduli of elasticity which are positive. The stress tensor is related
to the strain tensor by


$$ T_{ij} = \lambda\delta_{ij}\sum_k\frac{\partial\xi_k}{\partial x_k} + \mu\left(\frac{\partial\xi_i}{\partial x_j} + \frac{\partial\xi_j}{\partial x_i}\right) = \lambda\delta_{ij}\sum_k\sigma_{kk} + 2\mu\sigma_{ij} \tag{16.43} $$

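Electromagnetic field theory: The rank 2 Maxwell stress tensor $\mathbf{T}$ has components


$$ T_{ij} = \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{16.71} $$


The divergence theorem allows the total electromagnetic force, acting on the volume $\tau$, to be written as


$$ \mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{16.74} $$


The total momentum density satisfies


$$ \frac{\partial}{\partial t}(\boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}) = \nabla\cdot\mathbf{T} \tag{16.79} $$


where the electromagnetic field momentum density is given by the Poynting vector $\mathbf{S}$ as $\boldsymbol{\pi}_{field} = \epsilon_0\mu_0\mathbf{S}$.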
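

Ideal fluid dynamics: Mass conservation leads to the continuity equation

$$ \frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{16.82} $$


Euler’s hydrodynamic equation gives


$$ \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{16.90} $$


where $V$ is the scalar gravitational potential. If the flow is irrotational and time independent then


$$ \frac{1}{2}\rho v^2 + P + \rho V = \text{constant} \tag{16.94} $$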


Viscous fluid dynamics: For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then
the Navier-Stokes equation becomes


$$ \rho\left(\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right) = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{16.98} $$

where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 16.98 represents the rate of change
of momentum per unit volume while the right-hand side represents the summation of the forces per unit
volume that are acting.

The Reynolds number is a dimensionless number that characterizes the ratio of inertial forces to viscous 
forces in a viscous medium. The evolution of flow from laminar flow to turbulent flow, with increase of 
Reynolds number, was discussed. 

The classical mechanics of continuous fields encompasses a remarkably broad range of phenomena with 
important applications to laminar and turbulent fluid flow, gravitation, electromagnetism, relativity, and 
quantum fields. 


Chapter 17 


Relativistic mechanics 


17.1 Introduction 


Newtonian mechanics incorporates the Newtonian concept of the complete separation of space and time. 
This theory reigned supreme from inception, in 1687, until November 1905 when Einstein pioneered the 
Special Theory of Relativity. Relativistic mechanics undermines the Newtonian concepts of absoluteness of 
time that is inherent to Newton’s formulation, as well as when recast in the Lagrangian and Hamiltonian 
formulations of classical mechanics. Relativistic mechanics has had a profound impact on twentieth-century 
physics and the philosophy of science. Classical mechanics is an approximation of relativistic mechanics 
that is valid for velocities much less than the velocity of light in vacuum. The term “relativity” refers to 
the fact that physical measurements are always made relative to some chosen reference frame. Naively one 
may think that the transformation between different reference frames is trivial and contains little underlying 
physics. However, Einstein showed that the results of measurements depend on the choice of coordinate 
system, which revolutionized our concept of space and time. 

Einstein’s work on relativistic mechanics comprised two major advances. The first advance is the 1905 
Special Theory of Relativity which refers to nonaccelerating frames of reference. The second major advance 
was the 1916 General Theory of Relativity which considers accelerating frames of reference and their relation 
to gravity. The Special Theory is a limiting case of the General Theory of Relativity. The mathematically 
complex General Theory of Relativity is required for describing accelerating frames, gravity, plus related 
topics like Black Holes, or extremely accurate time measurements inherent to the Global Positioning System. 
The present discussion will focus primarily on the mathematically simple Special Theory of Relativity since it 
encompasses most of the physics encountered in atomic, nuclear and high energy physics. This chapter uses 
the basic concepts of the Special Theory of Relativity to investigate the implications of extending Newtonian, 
Lagrangian and Hamiltonian formulations of classical mechanics into the relativistic domain. The Lorentz- 
invariant extended Hamiltonian and Lagrangian formalisms are introduced since they are applicable to the 
Special Theory of Relativity. The General Theory of Relativity incorporates the gravitational force as a
geodesic phenomenon in a four-dimensional Riemannian structure based on space, time, and matter. A
superficial outline is given of the fundamental concepts and evidence that underlie the General Theory of
Relativity.


17.2 Galilean Invariance 


As discussed in chapter 2.3, an inertial frame is one in which Newton’s Laws of motion apply. Inertial frames 
are non-accelerating frames so that pseudo forces are not induced. All reference frames moving at constant 
velocity relative to an inertial reference, are inertial frames. Newton’s Laws of nature are the same in all 
inertial frames of reference and therefore there is no way of determining absolute motion because no inertial 
frame is preferred over any other. This is called Galilean-Newtonian invariance. Galilean invariance assumes 
that the concepts of space and time are completely separable. Time is assumed to be an absolute quantity 
that is invariant to transformations between coordinate systems in relative motion. Also the element of 
length is the same in different Galilean frames of reference. 
Consider two coordinate systems shown in figure 17.1, where the primed frame is moving along the $x_1$
axis of the fixed unprimed frame. A Galilean transformation implies that the following relations apply:


$$ x_1' = x_1 - vt \qquad x_2' = x_2 \qquad x_3' = x_3 \qquad t' = t \tag{17.1} $$


Note that at any instant $t$, the infinitesimal units of length in the two systems are identical since


$$ ds^2 = \sum_{i=1}^{3}dx_i^2 = \sum_{i=1}^{3}dx_i'^2 = ds'^2 \tag{17.2} $$


These are the mathematical expressions of the Newtonian idea of space and time. An immediate consequence
of the Galilean transformation is that the velocity of light must differ in different inertial reference frames.
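
Both statements can be checked with a few lines of Python. The sketch below applies the Galilean transformation 17.1 to pairs of events; the helper names, the chosen events, and the frame speed of 30 km/s (roughly the orbital speed of the Earth) are illustrative assumptions, not values from the text.

```python
C = 2.998e8   # speed of light in the unprimed frame, m/s


def galilean_transform(event, v):
    """Map an event (x1, x2, x3, t) to the primed frame moving at speed v along x1."""
    x1, x2, x3, t = event
    return (x1 - v * t, x2, x3, t)


def separation_squared(a, b):
    """Squared spatial separation of two events."""
    return sum((b[i] - a[i]) ** 2 for i in range(3))


def speed_along_x1(a, b):
    """Average speed along x1 of a signal travelling from event a to event b."""
    return (b[0] - a[0]) / (b[3] - a[3])


v_frame = 3.0e4   # assumed speed of the primed frame, m/s

# Two simultaneous events: their spatial separation is unchanged
a, b = (1.0, 2.0, 3.0, 5.0), (4.0, 6.0, 3.0, 5.0)
ds2 = separation_squared(a, b)
ds2_primed = separation_squared(galilean_transform(a, v_frame), galilean_transform(b, v_frame))

# A light pulse emitted at the origin and observed one second later
emit, detect = (0.0, 0.0, 0.0, 0.0), (C * 1.0, 0.0, 0.0, 1.0)
c_primed = speed_along_x1(galilean_transform(emit, v_frame), galilean_transform(detect, v_frame))

print(ds2, ds2_primed, c_primed)
```

The separation of simultaneous events is the same in both frames, while the transformed light speed comes out as $c - v$. This frame dependence of the speed of light is what the Galilean transformation predicts.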

At the end of the 19th century physicists thought they had discovered a way of identifying an absolute
inertial frame of reference, that is, it must be the frame of the medium that transmits light in vacuum.
Maxwell’s laws of electromagnetism predict that electromagnetic radiation in vacuum travels at
$c = 1/\sqrt{\mu_0 \varepsilon_0} = 2.998 \times 10^{8}$ m/s. Maxwell did not address the frame of reference
in which this speed applied. In the nineteenth
century all wave phenomena were transmitted by some medium, such as waves on a string, water waves, 
sound waves in air. Physicists thus envisioned that light was transmitted by some unobserved medium which 
they called the ether. This ether had mystical properties: it existed everywhere, even in outer space, and yet
had no other observed consequences. The ether obviously should be the absolute frame of reference.

In the 1880's, Michelson and Morley performed an experiment in Cleveland to try to detect this ether. They
transmitted light back and forth along two perpendicular paths in an interferometer, shown in figure 17.2,
and assumed that the earth's motion about the sun led to movement through the ether.

Figure 17.1: Motion of the primed frame along the $x_1$ axis with velocity v relative to the parallel unprimed
frame.

Figure 17.2: The Michelson interferometer used for the Michelson-Morley experiment. Interference of the
two beams of coherent light leads to fringes that depend on the differences in phase along the two paths.

A return trip takes longer in a moving medium, if the medium moves along the direction of travel, compared
to travel in a stationary medium. For example, you lose more time moving against a headwind than you
gain travelling back with the wind. The time difference Δt, for a round trip to a distance L, between
travelling in the direction of motion in the ether, versus travelling the same distance perpendicular to the
movement in the ether, is given by $\Delta t \approx \frac{L}{c}\left(\frac{v}{c}\right)^2$ where v is the
relative velocity of the ether and c is the velocity of light.

Interference fringes between perpendicular light beams in an optical interferometer provide an extremely
sensitive measure of this time difference. Michelson and Morley observed no measurable time difference at
any time during the year, that is, the relative motion of the earth within the ether is less than
1/6 the velocity of the earth around the sun. Their conclusion was either that the ether was dragged along
with the earth, or that the velocity of light was dependent on the velocity of the source, but these did not jibe
with other observations. The failure of this experiment to detect evidence for an absolute inertial frame is
important; it confounded physicists for two decades until Einstein’s Special Theory of Relativity explained
the result.
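
As a rough numerical illustration of how small the expected effect is, the following Python sketch compares the exact classical (ether-theory) round-trip times along the two arms with the leading-order estimate Δt ≈ (L/c)(v/c)². The arm length of 11 m and the orbital speed of 30 km/s are assumed illustrative values that are not taken from the text, and the function names are hypothetical.

```python
import math

C = 2.998e8  # speed of light in m/s, value quoted in the text


def ether_time_difference(arm_length, v, c=C):
    """Classical round-trip time along the arm parallel to the ether wind
    minus the round-trip time along the perpendicular arm."""
    beta_sq = (v / c) ** 2
    t_parallel = (2.0 * arm_length / c) / (1.0 - beta_sq)
    t_perpendicular = (2.0 * arm_length / c) / math.sqrt(1.0 - beta_sq)
    return t_parallel - t_perpendicular


def ether_time_difference_approx(arm_length, v, c=C):
    """Leading-order estimate (L/c)(v/c)^2, valid for v << c."""
    return (arm_length / c) * (v / c) ** 2


arm_length = 11.0   # assumed effective arm length in m
v_orbit = 3.0e4     # assumed orbital speed of the earth in m/s

exact = ether_time_difference(arm_length, v_orbit)
approx = ether_time_difference_approx(arm_length, v_orbit)
print(f"exact  time difference = {exact:.3e} s")
print(f"approx time difference = {approx:.3e} s")
```

Under these assumptions the time difference is of order 10⁻¹⁶ s, which is why an interferometric measurement of the relative phase is needed.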


17.3 Special Theory of Relativity


17.3.1 Einstein Postulates 


In September 1905, at the age of 26, Einstein published a seminal paper entitled “On the electrodynamics of
moving bodies”. He considered the relation between space and time in inertial frames of reference that are
in relative motion. In this paper he made the following postulates. 

1) The laws of nature are the same in all inertial frames of reference. 

2) The velocity of light in vacuum is the same in all inertial frames of reference. 

Note that Einstein's first postulate, coupled with Maxwell's equations, leads to the statement that the 
velocity of light in vacuum is a universal constant. Thus the second postulate is unnecessary since it is an 
obvious consequence of the first postulate plus Maxwell's equations which are basic laws of physics. This 
second postulate explained the null result of the Michelson-Morley experiment. However, it was not this 
experimental result that led Einstein to the theory of special relativity; he deduced the Special Theory of 
Relativity from consideration of Maxwell’s equations of electromagnetism. Although Einstein’s postulates 
appear reasonable, they lead to the following surprising implications. 


17.3.2 Lorentz transformation 


Galilean invariance leads to violation of the Einstein postulate that the velocity of light is a universal con- 
stant in all frames of reference. It is necessary to assume a new transformation law that renders physical 
laws relativistically invariant. Maxwell’s equations are relativistically invariant, which led to some electro- 
magnetic phenomena that could not be explained using Galilean invariance. In 1904 Lorentz proposed a new 
transformation to replace the Galilean transformation in order to explain such electromagnetic phenomena. 
Einstein’s genius was that he derived the transformation, that had been proposed by Lorentz, directly from 
the postulates of the Special Theory of Relativity. The Lorentz transformation satisfies Einstein’s theory of 
relativity, and has been confirmed to be correct by many experiments. 
For the geometry shown in figure 17.1, the Lorentz transformations are: 


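$$\begin{aligned} x' &= \gamma (x - vt) \\ y' &= y \\ z' &= z \\ t' &= \gamma \left(t - \frac{vx}{c^2}\right) \end{aligned} \tag{17.3}$$

where the Lorentz γ factor is

$$\gamma \equiv \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \tag{17.4}$$

The inverse transformations are

$$\begin{aligned} x &= \gamma (x' + vt') \\ y &= y' \\ z &= z' \\ t &= \gamma \left(t' + \frac{vx'}{c^2}\right) \end{aligned} \tag{17.5}$$

The Lorentz γ factor, defined above, is the key feature differentiating the Lorentz transformations from the
Galilean transformation. Note that γ ≥ 1; also γ → 1.0 as v → 0, and increases to infinity as v/c → 1 as
illustrated in figure 17.3. A useful fact that will be used later is that for v/c << 1:

$$\gamma \rightarrow 1 + \frac{1}{2}\left(\frac{v}{c}\right)^2 \qquad \text{Limit for } v << c$$

Figure 17.3: The dependence of the Lorentz γ factor on v/c.

Note that for v << c then γ ≈ 1 and the Lorentz transformation reduces to the Galilean transformation.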
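
The transformations 17.3 and 17.5 can be checked numerically. The Python sketch below is a minimal illustration, with hypothetical function names and an arbitrarily chosen event, that applies the Lorentz transformation, inverts it, and compares it with the Galilean transformation 17.1 at an everyday speed.

```python
import math

C = 2.998e8  # speed of light in m/s, value quoted in the text


def lorentz_gamma(v, c=C):
    """Lorentz gamma factor of equation 17.4."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def lorentz_transform(x, t, v, c=C):
    """Equation 17.3: (x, t) in the fixed frame -> (x', t') in the primed frame."""
    g = lorentz_gamma(v, c)
    return g * (x - v * t), g * (t - v * x / c**2)


def lorentz_inverse(x_p, t_p, v, c=C):
    """Equation 17.5: (x', t') in the primed frame -> (x, t) in the fixed frame."""
    g = lorentz_gamma(v, c)
    return g * (x_p + v * t_p), g * (t_p + v * x_p / c**2)


def galilean_transform(x, t, v):
    """Equation 17.1, for comparison."""
    return x - v * t, t


x, t = 1.0e3, 2.0e-6        # an arbitrary event: position in m, time in s
v_fast = 0.6 * C            # assumed relative velocity of the frames
x_p, t_p = lorentz_transform(x, t, v_fast)
x_back, t_back = lorentz_inverse(x_p, t_p, v_fast)
print("gamma:", lorentz_gamma(v_fast))
print("primed event:", (x_p, t_p))
print("recovered event:", (x_back, t_back))

v_slow = 30.0               # m/s, an everyday speed
print("Lorentz :", lorentz_transform(x, t, v_slow))
print("Galilean:", galilean_transform(x, t, v_slow))
```

Applying the inverse transformation recovers the original event, and at 30 m/s the Lorentz and Galilean results are indistinguishable at this precision.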


Figure 17.4: The observer and mirror are at rest in the left-hand frame (a). The light beam takes a time
Δt = d/c to travel to the mirror. In the right-hand frame (b) the source and mirror are travelling at a velocity
v relative to the observer. The light travels further in the right-hand frame of reference (b) than in the
stationary frame (a). Since Einstein states that the velocity of light is the same in both frames of reference
then the time interval must be larger in frame (b) since the light travels further than in (a).


17.3.3 Time Dilation: 


Consider that a clock is fixed at $x'_0$ in a moving frame and measures the time interval between two events
in the moving frame, i.e. $\Delta t' = t'_2 - t'_1$. According to the Lorentz transformation, the times in the fixed
frame are given by:

$$\begin{aligned} t_1 &= \gamma \left(t'_1 + \frac{v x'_0}{c^2}\right) \\ t_2 &= \gamma \left(t'_2 + \frac{v x'_0}{c^2}\right) \end{aligned} \tag{17.6}$$

Thus the time interval is given by:

$$t_2 - t_1 = \gamma (t'_2 - t'_1) \tag{17.7}$$

The time between events in the rest frame of the clock, Δτ = Δt', is called the proper time which always
is the shortest time measured between a given pair of events and is represented by the symbol τ. That is

$$\Delta t = \gamma \Delta t' = \gamma \Delta\tau \tag{17.8}$$

Note that the time interval for any other frame of reference, moving with respect to the clock frame, will
show larger time intervals because γ ≥ 1.0 which implies that the fixed frame perceives that the moving
clock is slow by the factor γ.

The plausibility of this time dilation can be understood by looking at the simple geometry of the space
ship example shown in Figure 17.4. Pretend that the clock in the proper frame of the space ship is based on
the time for the light to travel to and from the mirror in the space ship. In this proper frame the light has
the shortest distance to travel, and the proper transit time is

$$\Delta\tau = \frac{2d}{c} \tag{17.9}$$

In the fixed frame, b, the component of velocity in the direction of the mirror is $\sqrt{c^2 - v^2}$ using
Pythagoras' theorem, assuming that the light cannot travel faster than c. Thus the transit time towards and
back from the mirror must be

$$\Delta t = \frac{2d}{\sqrt{c^2 - v^2}} = \frac{\Delta\tau}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} = \gamma \Delta\tau \tag{17.10}$$

which is the predicted time dilation.

There are many experimental verifications of time dilation in physics. For example, a stationary muon
has a mean lifetime of $\tau_\mu$ = 2 μsec, whereas the lifetime of a fast moving muon, produced in the upper
atmosphere by high-energy cosmic rays, was observed in 1941 to be longer and given by $\gamma\tau_\mu$, as described in
example 17.1. In 1972 Hafele and Keating used four accurate cesium atomic clocks to confirm time dilation.
Two clocks were flown on regularly scheduled airliners travelling around the World, one westward and the
other eastward. The other two clocks were used for reference. The westward moving clock gained
(273 ± 7) nsec relative to the reference clocks, compared to the predicted value of (275 ± 21) nsec. The
Global Positioning System of 24 satellites is used for locating positions to within a few meters. Its timing
accuracy of a few nanoseconds requires allowance for time dilation and is a daily tribute to the correctness
of Einstein’s Theory of Relativity.


17.3.4 Length Contraction 


The Lorentz transformation leads to a contraction of the apparent length of an object in a moving frame 
as seen from a fixed frame. The length of a ruler in its own frame of reference is called the proper length. 
Consider an accurately measured rod of known proper length $L_p = x'_2 - x'_1$ that is at rest in the moving
primed frame. The locations of both ends of this rod are measured at a given time in the stationary frame,
$t_1 = t_2$, by taking a photograph of the moving rod. The corresponding locations in the moving frame are:

$$\begin{aligned} x'_2 &= \gamma (x_2 - vt_2) \\ x'_1 &= \gamma (x_1 - vt_1) \end{aligned} \tag{17.11}$$

Since $t_2 = t_1$, the measured lengths in the two frames are related by:

$$x'_2 - x'_1 = \gamma (x_2 - x_1) \tag{17.12}$$

That is, the lengths are related by:

$$L = \frac{L_p}{\gamma} \tag{17.13}$$


Note that the moving rod appears shorter in the direction of motion. As v — c the apparent length 
shrinks to zero in the direction of motion while the dimensions perpendicular to the direction of motion are 
unchanged. This is called the Lorentz contraction. If you could ride your bicycle at close to the speed of 
light, you would observe that stationary cars, buildings, people, all would appear to be squeezed thin along 
the direction that you are travelling. Also objects that are further away down any side street would be 
distorted in the direction of travel. A photograph taken by a stationary observer would show the moving 
bicycle to be Lorentz contracted along the direction of travel and the stationary objects would be normal. 


17.3.5 Simultaneity 


The Lorentz transformations imply a new philosophy of space and time. A surprising consequence is that 
the concept of simultaneity is frame dependent in contrast to the prediction of Newtonian mechanics. 

Consider that two events occur in frame S at $(x_1, t_1)$ and $(x_2, t_2)$. In frame S' these two events occur at
$(x'_1, t'_1)$ and $(x'_2, t'_2)$. From the Lorentz transformation the time difference is

$$t'_2 - t'_1 = \gamma \left[(t_2 - t_1) - \frac{v (x_2 - x_1)}{c^2}\right] \tag{17.14}$$

If the two events are simultaneous in frame S, that is $(t_2 - t_1) = 0$, then

$$t'_2 - t'_1 = \gamma \left[\frac{v (x_1 - x_2)}{c^2}\right] \tag{17.15}$$

Thus the events are not simultaneous in frame S' if $(x_2 - x_1) = L_p \neq 0$. That is, events that are simultaneous
in one frame are not simultaneous in the other frame if the events are spatially separated. The equivalent
statement is that two clocks, spatially separated by a distance $L_p$, which are synchronized in their rest
frame, are not synchronized in a moving frame.
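
Equation 17.14 is easy to evaluate numerically. The following Python sketch, with a hypothetical function name and an assumed separation of 100 m and relative velocity of 0.8c, shows that two events that are simultaneous in frame S acquire a non-zero time difference in frame S', and that the sign of this difference depends on the order of the two positions.

```python
import math

C = 2.998e8  # speed of light in m/s, value quoted in the text


def primed_time_difference(x1, t1, x2, t2, v, c=C):
    """t2' - t1' for two events, following equation 17.14."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g * ((t2 - t1) - v * (x2 - x1) / c**2)


separation = 100.0   # assumed spatial separation L_p in m
v = 0.8 * C          # assumed relative velocity of frame S'

# Two events that are simultaneous in S but spatially separated.
dt_forward = primed_time_difference(0.0, 0.0, separation, 0.0, v)
# The same two events with their labels interchanged.
dt_reverse = primed_time_difference(separation, 0.0, 0.0, 0.0, v)
# Two simultaneous events at the same place remain simultaneous.
dt_same_place = primed_time_difference(0.0, 0.0, 0.0, 0.0, v)

print(f"t2' - t1' = {dt_forward:.3e} s")
print(f"labels interchanged: {dt_reverse:.3e} s")
print(f"same place: {dt_same_place:.3e} s")
```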


Figure 17.5: If lightning strikes the front and rear of the carriage simultaneously, according to the man in
the fixed frame, then the woman in the moving frame sees the flash from the front first since she is moving
towards that approaching wavefront during the transit time of the light. Thus if the length of the carriage
in the stationary frame is $(x_2 - x_1) = L_p$ then the time difference is $\Delta t' = \gamma L_p \frac{v}{c^2}$.


Einstein discussed the example shown in figure 17.5, where lightning strikes both ends of a train simul- 
taneously in the stationary earth frame of reference. A woman on the train will see that the strikes are 
not simultaneous since the wavefront from the front of the carriage will be seen first because she is moving 
forward during the time the light from the two lightning flashes is travelling towards her. As a consequence 
she observes that the two lightning flashes are not simultaneous. This explains why measurement of the 
length of a moving rod, performed by simultaneously locating both ends in the fixed frame, implies that the 
measurement occurs at different times for both ends in the moving frame resulting in a shorter apparent 
length. The lack of simultaneity explains the apparent inconsistency that the moving bicyclist
sees the stationary street block to be length contracted while, in contrast, a pedestrian sees
the bicycle to be length contracted.

The time ordering of two spatially separated events that are simultaneous in one frame is not absolute since
$(x'_2 - x'_1)$ can be either positive or negative, and therefore the corresponding Δt can be positive or
negative. A consequence of the lack of simultaneity is that the image
shown by a photograph of a rapidly moving object is not a true representation of the moving object. Not 
only is the body contracted in the direction of travel, but also it appears distorted because light arriving from 
the far side of the body had to be emitted earlier, that is, when the body was at an earlier location, in order 
to reach the observer simultaneously with light from the near side. The relativistic snake paradox, addressed 
in Chapter 17 workshop exercise 1, is an example of the role of simultaneity in relativistic mechanics. 


17.1 Example: Muon lifetime 


Many people have trouble comprehending the time dilation and Lorentz contraction predicted by the Special
Theory of Relativity. The predictions appear to be crazy, but there are many examples where time dilation
and Lorentz contraction are observed experimentally such as the decay in flight of the muon. At rest, the
muon decays with a mean lifetime of 2 μs. Muons are created high in the atmosphere due to cosmic ray
bombardment. A typical muon travels at v = 0.998c which corresponds to γ ≈ 15. Time dilation implies
that the lifetime of the moving muon in the earth’s frame of reference is 30 μs. The speed of the muon is
essentially c in both frames of reference, and it would travel 600 m in 2 μs and 9000 m in 30 μs. In fact,
it is observed that the muon does travel, on average, 9000 m in the earth frame of reference before decaying.
Is this inconsistent with the view of someone travelling with the muon? In the muon’s moving frame, the
lifetime is only 2 μs, but the Lorentz contraction of distance means that 9000 m in the earth frame appears
to be only 600 m in the muon moving frame; a distance it travels in 2 μs. Thus in both frames of reference
we have consistent explanations, that is, the muon travels this distance through the atmosphere in one lifetime.
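
The consistency of the two descriptions can be verified with a few lines of arithmetic. The Python sketch below uses the values quoted in this example; note that v = 0.998c gives γ closer to 15.8 than the rounded value of 15, so the computed lifetime and distance are slightly larger than the rounded 30 μs and 9000 m. The variable names are illustrative.

```python
import math

C = 2.998e8          # speed of light in m/s, value quoted in the text
TAU_REST = 2.0e-6    # mean lifetime of the muon at rest in s, as quoted
BETA = 0.998         # muon speed as a fraction of c, as quoted

gamma = 1.0 / math.sqrt(1.0 - BETA**2)

# Earth frame: the moving muon lives longer (time dilation).
lifetime_earth = gamma * TAU_REST
distance_earth = BETA * C * lifetime_earth

# Muon frame: the lifetime is the proper lifetime, but the distance
# measured in the earth frame is Lorentz contracted.
distance_muon = distance_earth / gamma
time_muon = distance_muon / (BETA * C)

print(f"gamma                  = {gamma:.2f}")
print(f"lifetime, earth frame  = {lifetime_earth * 1e6:.1f} microseconds")
print(f"distance, earth frame  = {distance_earth:.0f} m")
print(f"distance, muon frame   = {distance_muon:.0f} m")
print(f"transit time, muon frame = {time_muon * 1e6:.2f} microseconds")
```

The transit time in the muon frame equals the proper lifetime, as the argument in the example requires.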


17.2 Example: Relativistic Doppler Effect


The relativistic Doppler effect is encountered frequently in physics and astronomy. Consider monochro- 
matic electromagnetic radiation from a source, such as a star, that is moving towards the detector at a 
velocity v. During the time At in the frame of the receiver, the source emits n cycles of the sinusoidal 
waveform. Thus the length of this waveform, as seen by the receiver, is nλ which equals

$$n\lambda = (c - v)\Delta t$$

The frequency as measured by the receiver is

$$\nu = \frac{c}{\lambda} = \frac{cn}{(c - v)\Delta t}$$

According to the source, it emits n waves of frequency $\nu_0$ during the proper time interval Δt', that is

$$n = \nu_0 \Delta t'$$

This proper time interval Δt', in the source frame, corresponds to a time interval Δt in the receiver frame
where

$$\Delta t = \gamma \Delta t'$$

Thus the frequency measured by the receiver is

$$\nu = \frac{1}{1 - \frac{v}{c}} \frac{\nu_0}{\gamma} = \nu_0 \frac{\sqrt{1 - \beta^2}}{1 - \beta} = \nu_0 \sqrt{\frac{1 + \beta}{1 - \beta}}$$

where β = v/c. This formula for source and receiver approaching each other also gives the correct answer for
source and receiver receding if the sign of β is changed.


This relativistic Doppler Effect accounts for the red shift observed for light emitted by receding stars and 
galaxies, as well as many examples in atomic and nuclear physics involving moving sources of electromagnetic 
radiation. 
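
The Doppler formula above can be evaluated directly. The following Python sketch, with hypothetical function names, computes the received frequency both from the closed form and from the intermediate step of the derivation, and shows the blue shift for an approaching source and the red shift for a receding source.

```python
import math


def doppler_frequency(nu0, beta):
    """Received frequency for a source approaching at beta = v/c.
    Use a negative beta for a receding source."""
    return nu0 * math.sqrt((1.0 + beta) / (1.0 - beta))


def doppler_from_time_dilation(nu0, beta):
    """Same result from the intermediate step nu = nu0 / (gamma (1 - beta))."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return nu0 / (gamma * (1.0 - beta))


nu0 = 1.0  # emitted frequency in arbitrary units
approaching = doppler_frequency(nu0, 0.6)
receding = doppler_frequency(nu0, -0.6)
print("approaching at 0.6c:", approaching)   # blue shifted, about 2 nu0
print("receding at 0.6c:   ", receding)      # red shifted, about nu0 / 2
```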


17.3 Example: Twin paradox 


A problem that troubled physicists for many years is called the twin paradox. Consider two identical
twins, Jack and Jill. Assume that Jill travels in a space ship at a speed corresponding to γ = 4 for 20 years,
as measured by Jack’s clock, and then returns taking another 20 years, according to Jack. Thus, Jack has aged 40 years
by the time his twin sister returns home. However, Jill’s clock measures 20/4 = 5 years for each half of the
trip so that she thinks she travelled for 10 years total time according to her clock. Thus she has aged only 10
years on the trip, that is, now she is 30 years younger than her twin brother. Note that, according to Jill, the
distance she travelled out and back was 1/4 the distance according to Jack, so she perceives no inconsistency
between her clock and the speed of the space ship. This was called a paradox because some people claimed that
Jill will perceive that the earth and Jack moved away at the same relative speed in the opposite direction and
thus according to Jill, Jack should be 30 years younger, not her. Moreover, some claimed that this problem
is symmetric and therefore both twins must still be the same age since there is no way of telling who was
moving away from whom. This argument is incorrect because Jill was able to sense that she accelerated to
γ = 4 which destroys the symmetry argument. The effect is observed with accelerated beams of unstable
particles such as the muon and was confirmed by the results of the experiment where cesium atomic clocks were
flown around the Earth. Thus the twin paradox is not a paradox; the fact is that Jill will be younger than
her twin brother.


17.4 Relativistic kinematics


17.4.1 Velocity transformations 


Consider the two parallel coordinate frames with the primed frame moving at a velocity v along the $x_1$ axis
as shown in figure 17.1. Velocities of an object measured in both frames are defined to be

$$u_i = \frac{dx_i}{dt} \qquad u'_i = \frac{dx'_i}{dt'}$$

Using the Lorentz transformations 17.3, 17.5 between the two frames moving with relative velocity v along
the $x_1$ axis, gives that the velocity along the $x'_1$ axis is

$$u'_1 = \frac{dx'_1}{dt'} = \frac{dx_1 - v\,dt}{dt - \frac{v}{c^2} dx_1} = \frac{u_1 - v}{1 - \frac{u_1 v}{c^2}} \tag{17.17}$$

Similarly we get the velocities along the perpendicular $x'_2$ and $x'_3$ axes to be

$$u'_2 = \frac{dx'_2}{dt'} = \frac{u_2}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)}$$

$$u'_3 = \frac{dx'_3}{dt'} = \frac{u_3}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)}$$

When v/c → 0 these velocity transformations become the usual Galilean relations for velocity addition.
Do not confuse u and u' with v; that is, u and u' are the velocities of some object measured in the unprimed
and primed frames of reference respectively, whereas v is the relative velocity of the origin of one frame with
respect to the origin of the other frame.
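
A short numerical check illustrates two consequences of these velocity transformations: the speed of light is the same in both frames, and two sub-light velocities never combine to exceed c. The Python sketch below works in units where c = 1; the function name and the sample velocities are illustrative assumptions.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Velocity (u1, u2, u3) measured in the unprimed frame, transformed to
    the primed frame moving with velocity v along the x1 axis."""
    u1, u2, u3 = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denominator = 1.0 - u1 * v / c**2
    return (
        (u1 - v) / denominator,
        u2 / (gamma * denominator),
        u3 / (gamma * denominator),
    )


def speed(u):
    """Magnitude of a velocity three-vector."""
    return math.sqrt(sum(component**2 for component in u))


# Light travelling along x1 and along x2, seen from a frame with v = 0.6c.
light_along_x1 = transform_velocity((1.0, 0.0, 0.0), 0.6)
light_along_x2 = transform_velocity((0.0, 1.0, 0.0), 0.6)
print(speed(light_along_x1), speed(light_along_x2))

# An object moving at 0.5c, seen from a frame moving at 0.5c the other way.
combined = transform_velocity((0.5, 0.0, 0.0), -0.5)
print(combined[0])   # 0.8c rather than the Galilean value of 1.0c
```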


17.4.2 Momentum 


Using the classical definition of momentum, that is p = mu, the linear momentum is not conserved using the
above relativistic velocity transformations if the mass m is a scalar quantity. This problem originates from
the fact that both x and t have non-trivial transformations and thus u = dx/dt is frame dependent.

Linear momentum conservation can be retained by redefining momentum in a form that is identical in
all frames of reference, that is by referring to the proper time τ as measured in the rest frame of the moving
object. Therefore we define relativistic linear momentum as

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = m \frac{d\mathbf{x}}{dt} \frac{dt}{d\tau} \tag{17.19}$$

But we know the time dilation relation

$$d\tau = \frac{dt}{\gamma_u} \tag{17.20}$$

Note that the $\gamma_u$ in this relation refers to the velocity u between the moving object and the frame; this is
quite different from the $\gamma = \frac{1}{\sqrt{1 - (v/c)^2}}$ which refers to the transformation between the two frames of reference.

Thus the new relativistic definition of momentum is

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = \gamma_u m \frac{d\mathbf{x}}{dt} = \gamma_u m \mathbf{u} \tag{17.21}$$

The relativistic definition of linear momentum is the same as the classical definition with the rest mass
m replaced by the relativistic mass γm.¹


¹Note that, until recently, the rest mass was denoted by $m_0$ and the relativistic mass was referred to as m. Modern texts
denote the rest mass by m and the relativistic mass by γm. This book follows the modern nomenclature for rest mass to avoid
confusion.


17.4.3 Center of momentum coordinate system


The classical relations for handling the kinematics of colliding objects carry over to special relativity when the
relativistic definition of linear momentum, equation 17.21, is assumed. That is, one can continue to apply 
conservation of linear momentum. However, there is one important conceptual difference for relativistic 
dynamics in that the center of mass no longer is a meaningful concept due to the interrelation of mass 
and energy. However, this problem is eliminated by considering the center of momentum coordinate system 
which, as in the non-relativistic case, is the frame where the total linear momentum of the system is zero. 
Using the concept of center of momentum incorporates the formalism of classical non-relativistic kinematics. 


17.4.4 Force 


Newton’s second law F = dp/dt is covariant under a Galilean transformation. In special relativity this definition
also applies using the relativistic definition of momentum p. The fact that the relativistic momentum p
is conserved in the force-free situation leads naturally to defining the force to be

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{17.22}$$

Then the relativistic momentum is conserved if F = 0.


17.4.5 Energy


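Work done is defined classically by

$$W_{12} \equiv \int_1^2 \mathbf{F} \cdot d\mathbf{r} = T_2 - T_1 \tag{17.23}$$

Assuming $T_1 = 0$, letting dr = u dt, and inserting the relativistic force relation in equation 17.23, gives

$$W = T = \int_0^t \frac{d}{dt}(\gamma_u m \mathbf{u}) \cdot \mathbf{u}\,dt = m \int_0^u u\,d(\gamma_u u) \tag{17.24}$$

Integration by parts, followed by algebraic manipulation, gives

$$\begin{aligned} T &= \gamma_u m u^2 - m \int_0^u \frac{u\,du}{\sqrt{1 - \frac{u^2}{c^2}}} = \gamma_u m u^2 + mc^2 \sqrt{1 - \frac{u^2}{c^2}} - mc^2 \\ &= \frac{m u^2}{\sqrt{1 - \frac{u^2}{c^2}}} + \frac{mc^2}{\sqrt{1 - \frac{u^2}{c^2}}} \left(1 - \frac{u^2}{c^2}\right) - mc^2 = mc^2 (\gamma_u - 1) \end{aligned} \tag{17.25}$$

Define the rest energy $E_0$

$$E_0 \equiv mc^2 \tag{17.26}$$

and total relativistic energy E

$$E \equiv \gamma_u mc^2 \tag{17.27}$$

then equation 17.25 can be written as

$$E = T + E_0 = \gamma_u mc^2 \tag{17.28}$$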


This is the famous Einstein relativistic energy that relates the equivalence of mass and energy. The total 
relativistic energy E is a conserved quantity in nature. It is an extension of the conservation of energy and 
manifestations of the equivalence of energy and mass occur extensively in the real world. 

In nuclear physics we often convert mass to energy and back again to mass. For example, gamma 
rays with energies greater than 1.022MeV, which are pure electromagnetic energy, can be converted to an 
electron plus positron both of which have rest mass. The positron can then annihilate a different electron in 
another atom resulting in emission of two 511keV gamma rays in back to back directions to conserve linear 
momentum. A dramatic example of Einstein’s equation is a nuclear reactor. One gram of material, the mass
of a paper clip, provides $E = 9 \times 10^{13}$ joules. This is the daily output of a 1 GW nuclear power station or
the explosive power of the Nagasaki or Hiroshima bombs.

As the velocity of a particle v approaches c, then γ and the relativistic mass γm both approach infinity.
This means that the force needed to accelerate the mass also approaches infinity, and thus no particle can
exceed the velocity of light. The energy continues to increase not by increasing the velocity but by increase
of the relativistic mass. Although the relativistic relation for kinetic energy is quite different from the
Newtonian relation, the Newtonian form is obtained for the case of u << c in that

$$T = mc^2 \left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} - mc^2 = mc^2 \left(1 + \frac{1}{2} \frac{u^2}{c^2} + \cdots\right) - mc^2 \approx \frac{1}{2} m u^2 \tag{17.29}$$


An especially useful relativistic relation that can be derived from the above is

$$E^2 = p^2 c^2 + E_0^2 \tag{17.30}$$

This is useful because it provides a simple relation between total energy of a particle and its relativistic
linear momentum plus rest energy.
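
The energy relations 17.25 to 17.30 can be confirmed numerically. The Python sketch below, with hypothetical function names and an assumed mass of one gram, checks the energy-momentum relation 17.30 at u = 0.8c, the Newtonian limit 17.29 at u = 0.01c, and the rest energy of one gram of material.

```python
import math

C = 2.998e8  # speed of light in m/s, value quoted in the text


def gamma_u(u, c=C):
    """Lorentz factor for an object moving with speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def momentum(m, u, c=C):
    """Relativistic linear momentum, equation 17.21."""
    return gamma_u(u, c) * m * u


def total_energy(m, u, c=C):
    """Total relativistic energy, equation 17.27."""
    return gamma_u(u, c) * m * c**2


def kinetic_energy(m, u, c=C):
    """Relativistic kinetic energy, equation 17.25."""
    return m * c**2 * (gamma_u(u, c) - 1.0)


m = 1.0e-3   # assumed rest mass of one gram, in kg

# Equation 17.30 at a relativistic speed.
u_fast = 0.8 * C
lhs = total_energy(m, u_fast) ** 2
rhs = (momentum(m, u_fast) * C) ** 2 + (m * C**2) ** 2
relative_error = abs(lhs - rhs) / lhs

# Newtonian limit, equation 17.29, at one percent of c.
u_slow = 0.01 * C
ratio = kinetic_energy(m, u_slow) / (0.5 * m * u_slow**2)

rest_energy = m * C**2
print(f"relative error in E^2 = p^2 c^2 + E0^2: {relative_error:.1e}")
print(f"T / (m u^2 / 2) at u = 0.01c: {ratio:.6f}")
print(f"rest energy of one gram: {rest_energy:.2e} J")
```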
17.4 Example: Rocket propulsion 


Consider a rocket, having initial mass M, that is accelerated in a straight line in free space by exhausting
propellant at a constant speed $v_p$ relative to the rocket. Let u be the speed of the rocket relative to its initial
rest frame S, when its rest mass has decreased to m. At this instant the rocket is at rest in the inertial frame
S'. At a proper time τ + dτ the rest mass is m − dm and it has acquired a velocity increment du' relative to
S', and propellant of rest mass $dm_p$ has been expelled with velocity $v_p$ relative to S'. At proper time τ in S'
the rest energy is $mc^2$. At the time τ + dτ, energy conservation requires that

$$\gamma_{du'} (m - dm) c^2 + \gamma_{v_p} dm_p c^2 = mc^2$$

At the same instant, conservation of linear momentum requires

$$\gamma_{du'} (m - dm)\,du' - \gamma_{v_p} v_p\,dm_p = 0$$

To first order these two equations simplify to

$$dm_p = \sqrt{1 - \left(\frac{v_p}{c}\right)^2}\,dm$$

$$m\,du' = dm_p \gamma_{v_p} v_p$$

Therefore

$$m\,du' = v_p\,dm \tag{a}$$

The velocity increment du' in frame S' can be transformed back to frame S using equation 17.5, that is

$$u + du = \frac{u + du'}{1 + \frac{u\,du'}{c^2}} \approx u + \left(1 - \left(\frac{u}{c}\right)^2\right) du' \tag{b}$$

Equations a and b yield a differential equation for u(m) of

$$\frac{du}{1 - \frac{u^2}{c^2}} = -v_p \frac{dm}{m}$$

where dm now denotes the differential change of the rest mass m, which is negative while propellant is being
expelled. Integrating the left-hand side between 0 and u and the right-hand side between M and m gives

$$\frac{c}{2} \ln \left(\frac{1 + \frac{u}{c}}{1 - \frac{u}{c}}\right) = -v_p \ln \left(\frac{m}{M}\right)$$

This reduces to

$$\frac{u}{c} = \frac{1 - \left(\frac{m}{M}\right)^{2 v_p / c}}{1 + \left(\frac{m}{M}\right)^{2 v_p / c}}$$

When $v_p/c \rightarrow 0$ this equation reduces to the non-relativistic answer given in equation 2.123.
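
The relativistic rocket result can be compared numerically with the non-relativistic form u = v_p ln(M/m). The Python sketch below is an illustration with hypothetical function names and assumed mass ratios and exhaust speeds; it shows that the two agree for a slow exhaust, whereas for a fast exhaust the non-relativistic expression exceeds c and the relativistic one does not.

```python
import math


def rocket_speed_relativistic(mass_ratio, vp_over_c):
    """u/c of the rocket when m/M = mass_ratio and the exhaust speed is
    vp_over_c, using the closed-form result of this example."""
    x = mass_ratio ** (2.0 * vp_over_c)
    return (1.0 - x) / (1.0 + x)


def rocket_speed_classical(mass_ratio, vp_over_c):
    """Non-relativistic rocket equation, u/c = (vp/c) ln(M/m)."""
    return -vp_over_c * math.log(mass_ratio)


# Slow exhaust: both expressions agree.
slow_rel = rocket_speed_relativistic(0.5, 1.0e-4)
slow_cls = rocket_speed_classical(0.5, 1.0e-4)
print(slow_rel, slow_cls)

# Fast exhaust and a large mass ratio: the classical value exceeds c.
fast_rel = rocket_speed_relativistic(0.1, 0.5)
fast_cls = rocket_speed_classical(0.1, 0.5)
print(fast_rel, fast_cls)
```

The closed form is equivalent to u/c = tanh[(v_p/c) ln(M/m)], which makes it evident that u never reaches c.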


17.5 Geometry of space-time


17.5.1 Four-dimensional space-time 


In 1906 Poincaré showed that the Lorentz transformation can be regarded as a rotation in a 4-dimensional
Euclidean space-time introduced by adding an imaginary fourth space-time coordinate ict to the three
real spatial coordinates. In 1908 Minkowski reformulated Einstein’s Special Theory of Relativity in this
4-dimensional Euclidean space-time vector space and concluded that the spatial variables $q_i$, where (i = 1, 2, 3),
plus the time $q_4 = ict$ are equivalent variables and should be treated equally using a covariant representation
of both space and time. The idea of using an imaginary time axis ict to make space-time Euclidean was
elegant, but it obscured the non-Euclidean nature of space-time as well as causing difficulties when generalized
to non-inertial accelerating frames in the General Theory of Relativity. As a consequence, the use of the
imaginary ict has been abandoned in modern work. Minkowski developed an alternative non-Euclidean
metric, the Minkowski metric, that treats all four coordinates (ct, x, y, z) as real and introduces the required
minus sign explicitly.

Analogous to the usual 3-dimensional cartesian coordinates, the displacement four vector ds is defined
using the four components along the four unit vectors in either the unprimed or primed coordinate frames.

$$d\mathbf{s} = dx^0 \hat{\mathbf{e}}_0 + dx^1 \hat{\mathbf{e}}_1 + dx^2 \hat{\mathbf{e}}_2 + dx^3 \hat{\mathbf{e}}_3 = dx'^0 \hat{\mathbf{e}}'_0 + dx'^1 \hat{\mathbf{e}}'_1 + dx'^2 \hat{\mathbf{e}}'_2 + dx'^3 \hat{\mathbf{e}}'_3 \tag{17.31}$$

The convention used is that greek subscripts (covariant) or superscripts (contravariant) designate a four
vector with 0 ≤ μ ≤ 3. The covariant unit vectors $\hat{\mathbf{e}}_\mu$ are written with the subscript μ which has 4 values
0 ≤ μ ≤ 3. As described in appendix E3, using the Einstein convention the components are written with
the contravariant superscript $dx^\mu$ where the time axis $x^0 = ct$, while the spatial coordinates, expressed in
cartesian coordinates, are $x^1 = x$, $x^2 = y$, and $x^3 = z$. With respect to a different (primed) unit vector basis
$\hat{\mathbf{e}}'_\mu$ the displacement must be unchanged as given by equation 17.31. In addition, equation 17.43 shows that
the magnitude $|d\mathbf{s}|^2$ of the displacement four vector is invariant to a Lorentz transformation.
The most general Lorentz transformation between inertial coordinate systems S and S', in relative motion
with velocity v, assuming that the two sets of axes are aligned, and that their origins overlap when t = t' = 0,
is given by the symmetric matrix Λ where

$$x'^{\mu} = \sum_{\nu} \Lambda_{\mu\nu} x^{\nu} \tag{17.32}$$

This Lorentz transformation of the four vector X components can be written in matrix form as

$$\mathbf{X}' = \mathbf{\Lambda} \mathbf{X} \tag{17.33}$$

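Assuming that the two sets of axes are aligned, then the elements of the Lorentz transformation $\Lambda_{\mu\nu}$ are
given by

$$\begin{pmatrix} x'^0 \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_1 & -\gamma\beta_2 & -\gamma\beta_3 \\ -\gamma\beta_1 & 1 + (\gamma - 1)\frac{\beta_1^2}{\beta^2} & (\gamma - 1)\frac{\beta_1 \beta_2}{\beta^2} & (\gamma - 1)\frac{\beta_1 \beta_3}{\beta^2} \\ -\gamma\beta_2 & (\gamma - 1)\frac{\beta_1 \beta_2}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_2^2}{\beta^2} & (\gamma - 1)\frac{\beta_2 \beta_3}{\beta^2} \\ -\gamma\beta_3 & (\gamma - 1)\frac{\beta_1 \beta_3}{\beta^2} & (\gamma - 1)\frac{\beta_2 \beta_3}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_3^2}{\beta^2} \end{pmatrix} \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{17.34}$$

where $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$, and assuming that the origin of S transforms to the origin of S' at (0, 0, 0, 0).

For the case illustrated in figure 17.1, where the corresponding axes of the two frames are parallel and in
relative motion with velocity v in the $x_1$ direction, then the Lorentz transformation matrix 17.34 reduces to

$$\begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\beta\gamma & 0 & 0 \\ -\beta\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{17.35}$$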

This Lorentz transformation matrix is called a standard boost since it only boosts from one frame to another 
parallel frame. In general a rotation matrix also is incorporated into the transformation matrix A for the 
spatial variables. 
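
The structure of the general boost matrix can be explored numerically. The following Python sketch builds the matrix of equation 17.34 from the three components of β, using plain nested lists and hypothetical function names, and checks that it reduces to the standard boost 17.35, that it is symmetric, and that it preserves the interval (ct)² − x² − y² − z² of an arbitrarily chosen four vector.

```python
import math


def boost_matrix(beta_vec):
    """4x4 Lorentz boost matrix of equation 17.34 for beta = v/c given as
    the three components (beta_1, beta_2, beta_3)."""
    beta_sq = sum(b * b for b in beta_vec)
    if beta_sq == 0.0:
        # No relative motion: the transformation is the identity.
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - beta_sq)
    matrix = [[0.0] * 4 for _ in range(4)]
    matrix[0][0] = gamma
    for i in range(3):
        matrix[0][i + 1] = -gamma * beta_vec[i]
        matrix[i + 1][0] = -gamma * beta_vec[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            matrix[i + 1][j + 1] = (
                delta + (gamma - 1.0) * (beta_vec[i] * beta_vec[j]) / beta_sq
            )
    return matrix


def apply_matrix(matrix, four_vector):
    """Matrix product of a 4x4 matrix with a four vector."""
    return [sum(matrix[i][j] * four_vector[j] for j in range(4)) for i in range(4)]


def interval(four_vector):
    """(ct)^2 - x^2 - y^2 - z^2 for a four vector (ct, x, y, z)."""
    ct, x, y, z = four_vector
    return ct**2 - x**2 - y**2 - z**2


standard = boost_matrix((0.6, 0.0, 0.0))      # boost along x1 only
general = boost_matrix((0.3, 0.4, 0.5))       # boost in an arbitrary direction

event = [5.0, 1.0, 2.0, 3.0]                  # arbitrary (ct, x, y, z)
boosted = apply_matrix(general, event)
print("standard boost, first row:", standard[0])
print("interval before:", interval(event))
print("interval after: ", interval(boosted))
```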


17.5.2 Four-vector scalar products


Scalar products of vectors and tensors usually are invariant to rotations in three-dimensional space providing 
an easy way to solve problems. The scalar, or inner, product of two four vectors is defined by 


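$$\mathbf{X} \cdot \mathbf{Y} \equiv g_{\mu\nu} X^{\mu} Y^{\nu} = \begin{pmatrix} X^0 & X^1 & X^2 & X^3 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3 \end{pmatrix} = X^0 Y^0 - X^1 Y^1 - X^2 Y^2 - X^3 Y^3 \tag{17.36}$$

The correct sign of the inner product is obtained by inclusion of the Minkowski metric g defined by

$$g_{\mu\nu} \equiv \hat{\mathbf{e}}_{\mu} \cdot \hat{\mathbf{e}}_{\nu} \tag{17.37}$$

that is, it can be represented by the matrix

$$\mathbf{g} \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{17.38}$$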

The sign convention used in the Minkowski metric, equation 17.38, has been chosen with the time coordinate
$(ct)^2$ positive which makes $(ds)^2 > 0$ for objects moving at less than the speed of light and corresponds to
ds being real.²

The presence of the Minkowski metric matrix, in the inner product of four vectors, complicates General
Relativity and thus the Einstein convention has been adopted where the components of the contravariant
four-vector X are written with superscripts $X^\mu$. See also appendix E. The corresponding covariant four-vector
components are written with the subscript $X_\mu$ which is related to the contravariant four-vector
components $X^\nu$ using the μν component of the covariant Minkowski metric matrix g. That is

$$X_{\mu} = \sum_{\nu=0}^{3} g_{\mu\nu} X^{\nu} \tag{17.39}$$

The contravariant metric component $g^{\mu\nu}$ is defined as the μν component of the inverse metric matrix $\mathbf{g}^{-1}$
where

$$\mathbf{g} \mathbf{g}^{-1} = \mathbf{I} = \mathbf{g}^{-1} \mathbf{g} \tag{17.40}$$

where I is the four-vector identity matrix. The contravariant components of the four vector can be expressed
in terms of the covariant components as

$$X^{\mu} = \sum_{\nu=0}^{3} g^{\mu\nu} X_{\nu} \tag{17.41}$$

Thus equations 17.39 and 17.41 can be used to transform between covariant and contravariant four vectors,
that is, to raise or lower the index μ.

The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant
four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either X or
Y thus

$$\mathbf{X} \cdot \mathbf{Y} = \sum_{\mu=0}^{3} \sum_{\nu=0}^{3} g_{\mu\nu} X^{\mu} Y^{\nu} = \sum_{\nu=0}^{3} X_{\nu} Y^{\nu} = \sum_{\mu=0}^{3} X^{\mu} Y_{\mu} \tag{17.42}$$


If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all 
coordinate systems obtained by proper Lorentz transformations. 


²Older textbooks, such as all editions of Marion, and the first two editions of Goldstein, use the Euclidean Poincaré
4-dimensional space-time with the imaginary time axis ict. About half the scientific community, and modern physics textbooks
including this textbook and the 3rd edition of Goldstein, use the Bjorken-Drell +,−,−,− sign convention given in equation
17.38 where $x_0 = ct$, and $x_1, x_2, x_3$ are the spatial coordinates. The other half of the community, including mathematicians
and gravitation physicists, use the opposite −,+,+,+ sign convention. Further confusion is caused by a few books that assign
the time axis ct to be $x_4$ rather than $x_0$.
The scalar inner product of the invariant space-time interval is an especially important example.

$$(ds)^2 = X \cdot X = c^2 (dt)^2 - (d\mathbf{r})^2 = (c\,dt)^2 - \sum_{i=1}^{3} dx_i^2 = (c\,d\tau)^2 \tag{17.43}$$

This is invariant to a Lorentz transformation as can be shown by applying the Lorentz standard boost
transformation given above. In particular, if $S'$ is the rest frame of the clock, then the invariant space-time
interval $ds$ is simply given by the proper time interval $d\tau$, that is, $ds = c\,d\tau$.


17.5.3 Minkowski space-time 


Figure 17.6 illustrates a three-dimensional $(ct, x^1, x^2)$ representation of the 4-dimensional space-time
diagram where it is assumed that $x^3 = 0$. The fact that the speed of light is fixed leads to the
concept of the light cone defined by the locus of $|\mathbf{x}| = ct$.


Inside the light cone 


The vertex of the cones represents the present. Locations inside the upper cone represent the future while
the past is represented by locations inside the lower cone. Note that $(ds)^2 = c^2 (dt)^2 - (d\mathbf{r})^2 > 0$
inside both the future and past light cones. Thus the space-time interval $c\Delta t$ is real and positive for
the future, whereas it is real and negative for the past relative to the vertex of the light cone. A world line
is the trajectory a particle follows as a function of time in Minkowski space. In the interior of the future
light cone $\Delta t > 0$ and, since it is real, it can be asserted unambiguously that any point inside this
forward cone must occur later than at the vertex of the cone, that is, it is the absolute future. A Lorentz
transformation can rotate Minkowski space such that the axis $x_0$ goes through any point within this light
cone and then the “world line” is pure time like. Similarly, any point inside the backward light cone
unambiguously occurred before the vertex, i.e. it is absolute past.


Outside the light cone 


Outside of the light cone $(ds)^2 = c^2 (dt)^2 - (d\mathbf{r})^2 < 0$, and thus $\Delta s$ is imaginary and is
called space like. A space-like plane hypersurface in spatial coordinates is shown for the present time in the
unprimed frame. A rotation in Minkowski space can be made to $S'$ such that the space-like hypersurface
now is tilted relative to the hypersurface shown and thus any point $P$ outside the light cone can be made to
occur later, simultaneous, or earlier than at the vertex depending on the orientation of the space-like
hypersurface. This startling situation implies that the time ordering of two points, each outside the other’s
light cone, can be reversed, which has profound implications related to the concept of simultaneity and the
notion of causality.

For the special case of two events lying on the light cone, $\sum_\mu x_\mu x^\mu = c^2t^2 - (x_1^2 + x_2^2 + x_3^2) = 0$ and thus
these events are separated by a light ray travelling at velocity $c$. Only events separated by time-like intervals
can be connected causally. The world line of a particle must lie within its light cone. The division of intervals
into space-like and time-like, because of their invariance, is an absolute concept. That is, it is independent
of the frame of reference.
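The classification of intervals, and the statement that time ordering can be reversed only for space-like separations, can be illustrated with a short sketch. It is not from the original text; it assumes one event at the vertex and a second event at $(c\,\Delta t, \Delta x_1, \Delta x_2, \Delta x_3)$, and the function names and sample events are hypothetical.

```python
import math

def interval_squared(c_dt, dx1, dx2, dx3):
    """(ds)^2 = (c dt)^2 - (dr)^2 with the (+,-,-,-) convention."""
    return c_dt ** 2 - (dx1 ** 2 + dx2 ** 2 + dx3 ** 2)

def classify(c_dt, dx1, dx2, dx3, tol=1e-12):
    """Label the separation from the vertex as time-, space- or light-like."""
    s2 = interval_squared(c_dt, dx1, dx2, dx3)
    if s2 > tol:
        return "time-like"
    if s2 < -tol:
        return "space-like"
    return "light-like"

def boosted_c_dt(c_dt, dx1, beta):
    """Time component after a standard boost along x^1 with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (c_dt - beta * dx1)

inside = (2.0, 1.0, 0.0, 0.0)    # inside the future light cone
outside = (1.0, 2.0, 0.0, 0.0)   # outside the light cone
on_cone = (3.0, 3.0, 0.0, 0.0)   # on the light cone

# A boost with beta = 0.8 reverses the time order of the space-like event,
# but cannot do so for the time-like event.
outside_after = boosted_c_dt(outside[0], outside[1], 0.8)
inside_after = boosted_c_dt(inside[0], inside[1], 0.8)
```

For the space-like event `outside_after` is negative although the event was later than the vertex in the unprimed frame, whereas `inside_after` stays positive for any $|\beta| < 1$.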

Figure 17.6: The light cone in the $(ct, x^1, x^2)$ space is defined by the condition
$X \cdot X = c^2t^2 - r^2 = 0$ and divides space-time into the forward and backward light cones,
with $t > 0$ and $t < 0$ respectively; the interiors of the forward and backward light cones
are called absolute future and absolute past.

The concept of proper time can be expanded by considering a clock at rest in frame $S'$ which is moving
with uniform velocity $v$ with respect to a rest frame $S$. If the clock at rest in the $S'$ frame measures the proper
time $\tau$, then the time observed in the fixed frame can be obtained by looking at the interval $ds$. Because of
the invariance of the interval $ds^2$, then


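$$ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{17.44}$$

That is,

$$d\tau = dt\left[1 - \frac{dx_1^2 + dx_2^2 + dx_3^2}{c^2 dt^2}\right]^{\frac{1}{2}} = dt\left[1 - \frac{v^2}{c^2}\right]^{\frac{1}{2}} = \frac{dt}{\gamma} \tag{17.45}$$

that is $dt = \gamma\,d\tau$ which satisfies the normal expression for time dilation, 17.8.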
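As a quick numerical illustration of equations 17.44 and 17.45, the sketch below computes the proper time directly from the invariant interval and compares it with $dt/\gamma$. It is an illustrative example only; units with $c = 1$ and the clock speed $0.6c$ are assumptions made for the example.

```python
import math

def gamma(beta):
    """Lorentz factor for speed beta = v/c (requires |beta| < 1)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def proper_time_from_interval(dt, dx, c=1.0):
    """d tau from equation 17.44: c^2 dtau^2 = c^2 dt^2 - dx^2."""
    return math.sqrt((c * dt) ** 2 - dx ** 2) / c

beta = 0.6        # clock speed as a fraction of c (illustrative value)
dt = 10.0         # elapsed time in the fixed frame S
dx = beta * dt    # distance covered in S during dt, with c = 1
dtau = proper_time_from_interval(dt, dx)
```

The moving clock records `dtau`, which is shorter than `dt` by the factor $\gamma = 1.25$ for this speed.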


17.5.4 Momentum-energy four vector 


The previous four-vector discussion can be elegantly exploited using the covariant Minkowski space-time
representation. Separating the spatial and time components of the differential four vector gives

$$dX = (c\,dt, d\mathbf{x}) \tag{17.46}$$

Remember that the square of the four-dimensional space-time element of length $(ds)^2$ is invariant (17.43),
and is simply related to the proper time element $d\tau$. Thus the scalar product

$$dX \cdot dX = ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{17.47}$$

Thus the proper time is an invariant.
The ratio of the four-vector element $dX$ and the invariant proper time interval $d\tau$ is a four-vector called
the four-vector velocity $U$ where

$$U = \frac{dX}{d\tau} = \left(c\frac{dt}{d\tau}, \frac{d\mathbf{x}}{d\tau}\right) = \gamma_u\left(c, \frac{d\mathbf{x}}{dt}\right) = \gamma_u (c, \mathbf{u}) \tag{17.48}$$

where $\mathbf{u}$ is the particle velocity, and $\gamma_u = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}}$.

The four-vector momentum $P$ can be obtained from the four-vector velocity by multiplying it by the
scalar rest mass $m$

$$P = mU = (\gamma_u mc, \gamma_u m\mathbf{u}) \tag{17.49}$$

However,

$$\gamma_u mc = \frac{E}{c} \tag{17.50}$$

thus the momentum four vector can be written as

$$P = \left(\frac{E}{c}, \mathbf{p}\right) \tag{17.51}$$

where the vector $\mathbf{p}$ represents the three spatial components of the relativistic momentum. It is interesting to
realize that the Theory of Relativity couples not only the spatial and time coordinates, but also
their conjugate variables, linear momentum $\mathbf{p}$ and total energy $E$.

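An additional feature of this momentum-energy four vector $P$ is that the scalar inner product $P \cdot P$ is
invariant to Lorentz transformations and equals $(mc)^2$ in the rest frame

$$P \cdot P = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu} P^\mu P^\nu = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g^{\mu\nu} P_\mu P_\nu = \left(\frac{E}{c}\right)^2 - |\mathbf{p}|^2 = (mc)^2 \tag{17.52}$$

which leads to the well-known equation

$$E^2 = p^2c^2 + m^2c^4 \tag{17.53}$$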
The Lorentz transformation matrix can be applied to $P$

$$P' = \Lambda P \tag{17.54}$$

The Lorentz invariant four-vector representation is illustrated by applying the Lorentz transformation
shown in figure 17.1, which gives $p_1' = \gamma\left(p_1 - \frac{v}{c^2}E\right)$, $p_2' = p_2$, $p_3' = p_3$, and $E' = \gamma(E - vp_1)$.
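The invariance of $P \cdot P$ under the boost quoted above can be verified numerically. The following sketch is illustrative only: it restricts the motion to the $x_1$ axis, uses SI units, and the mass and speed are arbitrary example values.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor for speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def four_momentum(m, u):
    """Return (E, p1) for rest mass m moving with velocity u along x1 (eqs. 17.49-17.51)."""
    g = gamma(u)
    return g * m * C ** 2, g * m * u

def boost(E, p1, v):
    """Standard boost along x1: E' = gamma (E - v p1), p1' = gamma (p1 - v E / c^2)."""
    g = gamma(v)
    return g * (E - v * p1), g * (p1 - v * E / C ** 2)

def invariant(E, p1):
    """P.P = (E/c)^2 - p^2, equation 17.52."""
    return (E / C) ** 2 - p1 ** 2

m = 1.0            # rest mass in kg (illustrative)
u = 0.8 * C        # particle velocity in frame S (illustrative)
E, p1 = four_momentum(m, u)
E_rest, p_rest = boost(E, p1, u)   # transform to the particle rest frame
```

In the rest frame the momentum vanishes and the energy reduces to $mc^2$, while `invariant` returns $(mc)^2$ in both frames.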
17.6 Lorentz-invariant formulation of Lagrangian mechanics


17.6.1 Parametric formulation 


The Lagrangian and Hamiltonian formalisms in classical mechanics are based on the Newtonian concept
of absolute time $t$ which serves as the system evolution parameter in Hamilton’s Principle. This approach
violates the Special Theory of Relativity. The extended Lagrangian and Hamiltonian formalism is a parametric
approach, pioneered by Lanczos[La49], that introduces a system evolution parameter $s$ that serves
as the independent variable in the action integral, and all the space-time variables $q_i(s), t(s)$ are dependent
on the evolution parameter $s$. This extended formalism casts Lagrangian and Hamiltonian mechanics in a form
that is compatible with the Special Theory of Relativity. The importance of the Lorentz-invariant extended
formulation of Lagrangian and Hamiltonian mechanics has been recognized for decades.[La49, Go50, Sy60]
Recently there has been a resurgence of interest in the extended Lagrangian and Hamiltonian formalism
stimulated by the papers of Struckmeier[Str05, Str08] and this formalism has featured prominently in recent
textbooks by Johns[Jo05] and Greiner[Gr10]. This parametric approach develops manifestly-covariant
Lagrangian and Hamiltonian formalisms that treat equally all $2n + 1$ space-time canonical variables. It provides
a plausible manifestly-covariant Lagrangian for the one-body system, but serious problems exist extending
this to the $N$-body system when $N > 1$. Generalizing the Lagrangian and Hamiltonian formalisms into the
domain of the Special Theory of Relativity is of fundamental importance to physics, while the parametric
approach gives insight into the philosophy underlying use of variational methods in classical mechanics.³

In conventional Lagrangian mechanics, the equations of motion for the $n$ generalized coordinates are
derived by requiring that the action integral be stationary, that is, Hamilton’s Principle.

$$\delta S(\mathbf{q}, \dot{\mathbf{q}}, t) = \delta \int_{t_i}^{t_f} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\,dt = 0 \tag{17.55}$$

where $L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)$ denotes the conventional Lagrangian. This approach implicitly assumes the Newtonian
concept of absolute time $t$ which is chosen to be the independent variable that characterizes the evolution
parameter of the system. The actual path $[\mathbf{q}(t), \dot{\mathbf{q}}(t)]$ the system follows is defined by the extremum of the
action integral $S(\mathbf{q}, \dot{\mathbf{q}}, t)$ which leads to the corresponding Euler-Lagrange equations. This assumption is
contrary to the Theory of Relativity which requires that the space and time variables be treated equally,
that is, the Lagrangian formalism must be covariant.


17.6.2 Extended Lagrangian 


Lanczos[La49] proposed making the Lagrangian covariant by introducing a general evolution parameter $s$,
and treating the time as a dependent variable $t(s)$ on an equal footing with the configuration space variables
$\mathbf{q}(s)$. That is, the time becomes a dependent variable $q^0(s) = ct(s)$ similar to the spatial variables $q^\mu(s)$
where $1 \le \mu \le n$. The dynamical system then is described as motion confined to a hypersurface within an
extended space where the value of the extended Hamiltonian and the evolution parameter $s$ constitute an
additional pair of canonically conjugate variables in the extended space. That is, the canonical momentum
$p_0$, corresponding to $q^0 = ct$, is $p_0 = \frac{E}{c}$ similar to the momentum-energy four vector, equation 17.51.

An extended Lagrangian $\mathbb{L}(\mathbf{q}(s), \frac{d\mathbf{q}(s)}{ds}, t(s), \frac{dt(s)}{ds})$ can be defined which can be written compactly as
$\mathbb{L}(q^\mu(s), \frac{dq^\mu(s)}{ds})$ where the index $0 \le \mu \le n$ denotes the entire range of space-time variables.

This extended Lagrangian can be used in an extended action functional $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ to give an extended
version of Hamilton’s Principle⁴

$$\delta \mathbb{S}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = \delta \int_{s_i}^{s_f} \mathbb{L}\left(q^\mu(s), \frac{dq^\mu(s)}{ds}\right) ds = 0 \tag{17.56}$$


³Chapters 17.6 and 17.7 reproduce the Struckmeier presentation.[Str08]

⁴These formulae involve total and partial derivatives with respect to both time $t$ and parameter $s$. For clarity, the derivatives
are written out in full because Lanczos[La49] and Johns[Jo05] use the opposite convention for the dot and prime superscripts
as abbreviations for the differentials with respect to $t$ and $s$. The blackboard bold format is used to designate the extended
versions of the action $\mathbb{S}$, Lagrangian $\mathbb{L}$ and Hamiltonian $\mathbb{H}$.
The conventional action $S$, and extended action $\mathbb{S}$, address alternate characterizations of the same underlying
physical system, and thus the action principle implies that $\delta S = \delta \mathbb{S} = 0$ must hold simultaneously. That is,

$$\delta \int_{s_i}^{s_f} L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds}\,ds = \delta \int_{s_i}^{s_f} \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) ds \tag{17.57}$$
As discussed in chapter 9.3, there is a continuous spectrum of equivalent gauge-invariant Lagrangians for
which the Euler-Lagrange equations lead to identical equations of motion. Equation 17.57 is satisfied if the
conventional and extended Lagrangians are related by

$$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} + \frac{d\Lambda(\mathbf{q}, t)}{ds} \tag{17.58}$$

where $\Lambda(\mathbf{q}, t)$ is a continuous function of $\mathbf{q}$ and $t$ that has continuous second derivatives. It is acceptable to
assume that $\frac{d\Lambda(\mathbf{q}, t)}{ds} = 0$, then the extended and conventional Lagrangians have a unique relation requiring
no simultaneous transformation of the dynamical variables. That is, assume

$$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{17.59}$$

Note that the time derivative of $\mathbf{q}$ can be expressed in terms of the $s$ derivatives by

$$\frac{d\mathbf{q}}{dt} = \frac{d\mathbf{q}/ds}{dt/ds} \tag{17.60}$$

Thus, for a conventional Lagrangian with $n$ variables, the corresponding extended Lagrangian is a function
of $n + 1$ variables while the conventional and extended Lagrangians are related using equations 17.59 and
17.60.

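The derivatives of the relation between the extended and conventional Lagrangians lead to

$$\frac{\partial \mathbb{L}}{\partial q^\mu} = \frac{\partial L}{\partial q^\mu}\frac{dt}{ds} \tag{17.61}$$

$$\frac{\partial \mathbb{L}}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds} \tag{17.62}$$

$$\frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.63}$$

$$\frac{\partial \mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{\mu=1}^{n} \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt} \tag{17.64}$$

where $1 \le \mu \le n$ since the $\mu = 0$ time derivatives are written explicitly in equations 17.62, 17.64.
Equations 17.63 - 17.64, summed over the extended range $0 \le \mu \le n$ of time and spatial dynamical
variables, imply

$$\sum_{\mu=0}^{n} \frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds} = L\frac{dt}{ds} - \sum_{\mu=1}^{n} \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\frac{dt}{ds} + \sum_{\mu=1}^{n} \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{ds} = \mathbb{L} \tag{17.65}$$

Equation 17.65 can be written in the form

$$\mathbb{L} - \sum_{\mu=0}^{n} \frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds} \;\begin{cases} = 0 & \text{if } \mathbb{L} \text{ is not homogeneous in } \frac{dq^\mu}{ds} \\ \equiv 0 & \text{if } \mathbb{L} \text{ is homogeneous in } \frac{dq^\mu}{ds} \end{cases} \tag{17.66}$$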
If the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is homogeneous to first order in the $n+1$ variables $\frac{dq^\mu}{ds}$, then Euler’s
theorem on homogeneous functions trivially implies the relation given in equation 17.66. Struckmeier[Str08]
identified a subtle but important point that if $\mathbb{L}$ is not homogeneous in $\frac{dq^\mu}{ds}$ then equation 17.66 is not an
identity but is an implicit equation that is always satisfied as the system evolves according to the solution
of the extended Euler-Lagrange equations. Then equation 17.59 is satisfied without it being a homogeneous
form in the $n+1$ velocities $\frac{dq^\mu}{ds}$. This introduces a new class of non-homogeneous Lagrangians. The relativistic
free particle, discussed in example 17.5, is a case of a non-homogeneous extended Lagrangian.
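The homogeneous case can be checked numerically. The sketch below builds an extended Lagrangian from a conventional one using equations 17.59 and 17.60 and then tests Euler’s relation 17.66 with central finite differences. The one-dimensional harmonic-oscillator Lagrangian, its parameter values, and the function names are assumptions chosen only for illustration.

```python
def L_conv(q, qdot, t, m=1.0, k=4.0):
    """Hypothetical conventional Lagrangian: 1-D harmonic oscillator."""
    return 0.5 * m * qdot ** 2 - 0.5 * k * q ** 2

def L_ext(q, q_s, t, t_s):
    """Extended Lagrangian from equations 17.59 and 17.60.

    q_s = dq/ds and t_s = dt/ds, so dq/dt = q_s / t_s.
    """
    return L_conv(q, q_s / t_s, t) * t_s

def euler_sum(q, q_s, t, t_s, h=1e-6):
    """Sum over the n+1 velocities of (dL_ext/dv) * v, by central differences."""
    dL_dqs = (L_ext(q, q_s + h, t, t_s) - L_ext(q, q_s - h, t, t_s)) / (2.0 * h)
    dL_dts = (L_ext(q, q_s, t, t_s + h) - L_ext(q, q_s, t, t_s - h)) / (2.0 * h)
    return dL_dqs * q_s + dL_dts * t_s

# Arbitrary point (q, dq/ds, t, dt/ds) in the extended space.
point = (0.3, 1.7, 0.0, 2.5)
lhs = euler_sum(*point)
rhs = L_ext(*point)
```

Because this `L_ext` is homogeneous to first order in the velocities, `lhs` equals `rhs` at any point, not only along solutions of the equations of motion.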
17.6.3 Extended generalized momenta


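The generalized momentum is defined by

$$p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.67}$$

Assume that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related
by a Legendre transformation, and are based on variational principles, analogous to the relation that exists
between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining
the extended generalized (canonical) momentum-energy four vector $P(s) = \left(\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s)\right)$. The momentum
components of the momentum-energy four vector are given by the $1 \le \mu \le n$ components
using equation 17.63.

$$p_\mu(s) = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} = \frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} \tag{17.68}$$

The $\mu = 0$ component of the momentum-energy four vector can be derived by recognizing that the right-hand
side of equation 17.64 is equal to $-H(p_\mu, q^\mu, t)$. That is, the corresponding generalized momentum $p_0$, that
is conjugate to $q^0 = ct$, is given by

$$p_0 = \frac{\partial \mathbb{L}}{\partial\left(\frac{dq^0}{ds}\right)} = \frac{1}{c}\frac{\partial \mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = \frac{1}{c}\left[L - \sum_{\mu=1}^{n} \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\right] = -\frac{H(p_\mu, q^\mu, t)}{c} \tag{17.69}$$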


17.6.4 Extended Lagrange equations of motion 


By direct analogy with the non-relativistic action integral 17.55, the extremum for the relativistic action
integral $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is obtained using the Euler-Lagrange equations derived from equation 17.56 where the
independent variable is $s$. This implies that for $0 \le \mu \le n$

$$\frac{d}{ds}\left(\frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\right) - \frac{\partial \mathbb{L}}{\partial q^\mu} = \mathbb{Q}_\mu^{EX} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q^\mu} + \mathbb{Q}_\mu^{EXC} \tag{17.70}$$

where the extended generalized force $\mathbb{Q}_\mu^{EX}$, shown on the right-hand side of equation 17.70, accounts for all
forces not included in the potential energy term in the Lagrangian. The extended generalized force $\mathbb{Q}_\mu^{EX}$ can
be factored into two terms as discussed in chapter 6, equation 6.60. The Lagrange multiplier term includes
$1 \le k \le m$ holonomic constraint forces where the $m$ holonomic constraints, which do no work, are expressed
in terms of the $m$ algebraic equations of holonomic constraint $g_k$. The $\mathbb{Q}_\mu^{EXC}$ term includes the remaining
constraint forces and generalized forces that are not included in the Lagrange multiplier term or the potential
energy term of the Lagrangian.

For the case where $\mu = 0$, since $q^0 = ct$, then equation 17.70 reduces to

$$\frac{d}{ds}\left(\frac{\partial \mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial \mathbb{L}}{\partial t} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial t} + c\,\mathbb{Q}_0^{EXC} \tag{17.71}$$

These Euler-Lagrange equations of motion 17.70, 17.71 determine the $1 \le \mu \le n$ generalized coordinates
$q^\mu(s)$, plus $q^0 = ct(s)$ in terms of the independent variable $s$.

If the holonomic equations of constraint are time independent, that is $\frac{\partial g_k}{\partial t} = 0$, and if $\mathbb{Q}_0^{EXC} = 0$, then
the $\mu = 0$ term of the Euler-Lagrange equations simplifies to

$$\frac{d}{ds}\left(\frac{\partial \mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial \mathbb{L}}{\partial t} = 0 \tag{17.72}$$

One interpretation is to select $L$ to be primary. Then $\mathbb{L}$ is derived from $L$ using equation 17.59 and $\mathbb{L}$
must satisfy the identity given by equation 17.66 while the Euler-Lagrange equation containing $\frac{dt}{ds}$ yields an
identity which implies that $\mathbb{L}$ does not provide an equation of motion in terms of $t(s)$. Conversely, if $\mathbb{L}$ is
chosen to be primary, then $\mathbb{L}$ is no longer a homogeneous function and equation 17.66 serves as a constraint
on the motion that can be used to deduce $L$, while $\frac{dt}{ds}$ yields a non-trivial equation of motion in terms of
$t(s)$. In both cases the occurrence of a constraint surface results from the fact that the extended space has
$2n + 2$ variables to describe $2n + 1$ degrees of freedom, that is, one more degree of freedom than required for
the actual system.


17.5 Example: Lagrangian for a relativistic free particle 


The standard Lagrangian $L = T - U$ is not Lorentz invariant. The extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$
introduces the independent variable $s$ which treats both the space variables $\mathbf{q}(s)$ and time variable $q^0 = ct(s)$
equally. This can be achieved by defining the non-standard Lagrangian

$$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = \frac{1}{2}mc^2\left[\frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right] \tag{$\alpha$}$$

The constant third term in the bracket is included to ensure that the extended Lagrangian converges to the
standard Lagrangian in the limit $\frac{dt}{ds} \to 1$.

Note that the extended Lagrangian ($\alpha$) is not homogeneous to first order in the velocities $\frac{dq^\mu}{ds}$.
Equation 17.66 therefore is not an identity and must be imposed as a constraint on the motion. Inserting ($\alpha$)
into equation 17.66 gives the constraint relation

$$\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - 1 = 0 \tag{$\beta$}$$

Inserting ($\beta$) into the extended Lagrangian ($\alpha$) yields that the square bracket in equation ($\alpha$) must equal $-2$.
Thus

$$\mathbb{L} = \frac{1}{2}mc^2\left[-2\right] = -mc^2 \tag{$\gamma$}$$

The constraint equation ($\beta$) implies that

$$\frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma} \tag{$\delta$}$$

Using equation ($\delta$) together with equation 17.59 gives that the relativistic Lagrangian is

$$L = \mathbb{L}\frac{ds}{dt} = -\frac{mc^2}{\gamma} = -mc^2\sqrt{1 - \beta^2} \tag{$\epsilon$}$$

Equation ($\epsilon$) is the conventional relativistic Lagrangian derived by assuming that the system evolution
parameter $s$ is transformed to be along the world line $ds$, where the invariant length $ds$ replaces the proper time
interval $d\tau = \frac{dt}{\gamma}$.

The definition of the generalized (canonical) momentum

$$p_i = \frac{\partial L}{\partial \dot{q}_i} = \gamma m \dot{q}_i \tag{$\zeta$}$$

leads to the relativistic expression for momentum given in equation 17.21.

The relativistic Lagrangian is an important example of a non-standard Lagrangian. Equation ($\alpha$) does not
equal the difference between the kinetic and potential energies, that is, the relativistic expression for kinetic
energy is given by 17.28 to be

$$T = (\gamma - 1)mc^2 \tag{$\eta$}$$

The non-standard relativistic Lagrangian ($\epsilon$) can be used with the Euler-Lagrange equations to derive the
second-order equations of motion for both relativistic and non-relativistic problems within the Special Theory
of Relativity.
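The steps of this example can be checked numerically. The sketch below is illustrative only: it assumes one spatial dimension, units with $m = c = 1$, and a sample speed of $0.6c$. It evaluates the extended free-particle Lagrangian $\frac{1}{2}mc^2[\frac{1}{c^2}(\frac{dq}{ds})^2 - (\frac{dt}{ds})^2 - 1]$ on the constraint surface $\frac{dt}{ds} = \gamma$, $\frac{dq}{ds} = \gamma v$, recovers $L = -mc^2/\gamma$, and compares a finite-difference derivative of $L$ with $\gamma m v$.

```python
import math

M = 1.0   # rest mass (illustrative units)
C = 1.0   # speed of light (illustrative units)

def gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def L_rel(v):
    """Conventional relativistic free-particle Lagrangian, L = -m c^2 / gamma."""
    return -M * C ** 2 * math.sqrt(1.0 - (v / C) ** 2)

def L_ext(q_s, t_s):
    """Extended free-particle Lagrangian with q_s = dq/ds and t_s = dt/ds."""
    return 0.5 * M * C ** 2 * ((q_s / C) ** 2 - t_s ** 2 - 1.0)

def momentum_numeric(v, h=1e-6):
    """p = dL/dv estimated by a central difference."""
    return (L_rel(v + h) - L_rel(v - h)) / (2.0 * h)

v = 0.6
g = gamma(v)
t_s, q_s = g, g * v                              # point on the constraint surface
constraint = t_s ** 2 - (q_s / C) ** 2 - 1.0     # left-hand side of the constraint
L_on_surface = L_ext(q_s, t_s)                   # expected to equal -m c^2
L_from_ext = L_on_surface / t_s                  # L = L_ext * ds/dt
p_numeric = momentum_numeric(v)
```

On the constraint surface `L_on_surface` equals $-mc^2$, `L_from_ext` reproduces `L_rel(v)`, and `p_numeric` agrees with $\gamma m v$.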
17.6 Example: Relativistic particle in an external electromagnetic field


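A charged particle moving at relativistic speed in an external electromagnetic field provides an example
of the use of the relativistic Lagrangian.

In the discussion of classical mechanics it was shown that the velocity-dependent Lorentz force can be
absorbed into the scalar electric potential $\Phi$ plus the vector magnetic potential $\mathbf{A}$. That is, the potential
energy is given by equation 7.6 to be $U = q(\Phi - \mathbf{A} \cdot \mathbf{v})$. Including this in the relativistic Lagrangian,
equation ($\epsilon$) of example 17.5, gives

$$L = -mc^2\sqrt{1 - \beta^2} - q\Phi + q\mathbf{A} \cdot \mathbf{v}$$

The three spatial partial derivatives can be written in vector notation as

$$\frac{\partial L}{\partial \mathbf{r}} = -q\nabla\Phi + q\nabla(\mathbf{v} \cdot \mathbf{A}) \tag{a}$$

and the generalized momentum is given by

$$\mathbf{p} = \frac{\partial L}{\partial \mathbf{v}} = \gamma m\mathbf{v} + q\mathbf{A} \tag{b}$$

which has the same form as the non-relativistic answer given by equation 7.6. That is, it includes the momentum of
the electromagnetic field plus the linear momentum of the moving particle.

The total time derivative of the generalized momentum is

$$\frac{d\mathbf{p}}{dt} = \frac{d}{dt}\left(\frac{\partial L}{\partial \mathbf{v}}\right) = \frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt} \tag{c}$$

where the last term is given by the chain rule

$$\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{A}$$

Using equations a, b, c in the Euler-Lagrange equation gives

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \mathbf{v}}\right) = \frac{\partial L}{\partial \mathbf{r}}$$

$$\frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt} = -q\nabla\Phi + q\nabla(\mathbf{v} \cdot \mathbf{A})$$

Collecting terms and using the well-known vector-product identity, plus the definitions $\mathbf{B} = \nabla \times \mathbf{A}$ and
$\mathbf{E} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$, gives

$$\frac{d}{dt}(\gamma m\mathbf{v}) = -q\left[\nabla\Phi + \frac{\partial \mathbf{A}}{\partial t}\right] + q\left[\nabla(\mathbf{v} \cdot \mathbf{A}) - (\mathbf{v} \cdot \nabla)\mathbf{A}\right]$$

$$= q\left[-\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}\right] + q\left[\mathbf{v} \times (\nabla \times \mathbf{A})\right]$$

$$\mathbf{F} = q\left[\mathbf{E} + \mathbf{v} \times \mathbf{B}\right]$$

If we adopt the definition that the relativistic canonical momentum is $\mathbf{p} = \gamma m\mathbf{v}$ then the left-hand side is
the relativistic force while the right-hand side is the well-known Lorentz force of electromagnetism. Thus
the extended Lagrangian formulation correctly reproduces the well-known Lorentz force for a charged particle
moving in an electromagnetic field.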
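The vector-product identity used in the last step, $\nabla(\mathbf{v} \cdot \mathbf{A}) - (\mathbf{v} \cdot \nabla)\mathbf{A} = \mathbf{v} \times (\nabla \times \mathbf{A})$ with $\mathbf{v}$ held constant under the spatial derivatives, can be spot-checked numerically. The sketch below is not from the original text; the vector potential `A`, the velocity and the evaluation point are arbitrary choices made for illustration, and derivatives are estimated with central differences.

```python
import math

def A(r):
    """Hypothetical smooth vector potential used only to test the identity."""
    x, y, z = r
    return (y * z, x * x * z, math.sin(x) * y)

def partial(f, r, i, h=1e-5):
    """Central-difference estimate of d f / d x_i at point r (f returns a float)."""
    rp, rm = list(r), list(r)
    rp[i] += h
    rm[i] -= h
    return (f(rp) - f(rm)) / (2.0 * h)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl_A(r):
    # dA[i][j] holds d A_j / d x_i
    dA = [[partial(lambda p, j=j: A(p)[j], r, i) for j in range(3)] for i in range(3)]
    return (dA[1][2] - dA[2][1], dA[2][0] - dA[0][2], dA[0][1] - dA[1][0])

def lhs(v, r):
    """grad(v.A) - (v.grad)A with v treated as a constant vector."""
    grad = [partial(lambda p: sum(v[j] * A(p)[j] for j in range(3)), r, i)
            for i in range(3)]
    conv = [sum(v[i] * partial(lambda p, j=j: A(p)[j], r, i) for i in range(3))
            for j in range(3)]
    return tuple(grad[k] - conv[k] for k in range(3))

def rhs(v, r):
    """v x (curl A)."""
    return cross(v, curl_A(r))

v = (0.3, -0.2, 0.5)
r = (0.7, 1.1, -0.4)
left, right = lhs(v, r), rhs(v, r)
```

The components of `left` and `right` agree to within the finite-difference error, consistent with the magnetic part of the Lorentz force $q\,\mathbf{v} \times \mathbf{B}$.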
17.7 Lorentz-invariant formulations of Hamiltonian mechanics


17.7.1 Extended canonical formalism 


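A Lorentz-invariant formulation of Hamiltonian mechanics can be developed that is built upon the extended
Lagrangian formalism assuming that the Hamiltonian and Lagrangian are related by a Legendre
transformation. That is,

$$H(\mathbf{q}, \mathbf{p}, t) = \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{dt} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right) \tag{17.73}$$

where the generalized momentum is defined by

$$p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.74}$$

Struckmeier[Str08] assumes that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended
Hamiltonian $\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the
relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation
requires defining the extended generalized (canonical) momentum-energy four vector $P(s) = \left(\frac{\mathcal{E}(s)}{c}, \mathbf{p}(s)\right)$.
The momentum components of the momentum-energy four vector are given by the $1 \le \mu \le n$
components using either the conventional or the extended Lagrangians as given in equation 17.68

$$p_\mu(s) = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} = \frac{\partial \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} \tag{17.68}$$

The $\mu = 0$ component of the momentum-energy four vector is given by equation 17.69

$$p_0(s) = \frac{1}{c}\frac{\partial \mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = -\frac{H(p_\mu, q^\mu, t)}{c} = -\frac{\mathcal{E}(s)}{c} \tag{17.75}$$

where $\mathcal{E}(s)$ represents the instantaneous generalized energy of the conventional Hamiltonian at the point $s$,
but not the functional form of $H(\mathbf{q}(s), \mathbf{p}(s), t(s))$. That is

$$\mathcal{E}(s) = H(\mathbf{q}(s), \mathbf{p}(s), t(s)) \tag{17.76}$$

Note that $\mathcal{E}(s)$ does not give the function $H(\mathbf{q}, \mathbf{p}, t)$. Equations 17.68 and 17.69 give that

$$p_0(s) = -\frac{\mathcal{E}(s)}{c} \tag{17.77}$$

The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$, in an extended phase space, can be defined by the Legendre
transformation and the four-vector $P$ to be

$$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \left(P \cdot \frac{d\mathbf{q}}{ds}\right) - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{17.78}$$

$$= \sum_{\mu=0}^{n} p_\mu \frac{dq^\mu}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right)$$

$$= \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{17.79}$$

where the $p_0$ term has been written explicitly as $-\mathcal{E}\frac{dt}{ds}$ in equation 17.79. The extended Hamiltonian
$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ can carry all the information on the dynamical system that is carried by the extended
Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$, if the Hesse matrix is non-singular. That is, if

$$\det\left(\frac{\partial^2 \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)\partial\left(\frac{dq^\nu}{ds}\right)}\right) \ne 0 \tag{17.80}$$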
If the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is not homogeneous in the $n+1$ velocities $\frac{dq^\mu}{ds}$ then the extended
set of Euler-Lagrange equations 17.72 is not redundant. Thus equation 17.66 is not an identity but it can be
regarded as an implicit equation that is always satisfied by the extended set of Euler-Lagrange equations. As
a result, the Legendre transformation to an extended Hamiltonian exists. That is, equation 17.66 is identical
to the Legendre transform for $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ which was shown to equal zero. Therefore

$$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0 \tag{17.81}$$

which means that the extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ directly defines the restricted hypersurface on
which the particle motion is confined.

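The extended canonical equations of motion, derived using the extended Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
with the usual Hamiltonian mechanics relations, are:

$$\frac{\partial \mathbb{H}}{\partial p_\mu} = \frac{dq^\mu}{ds} \tag{17.82}$$

$$\frac{\partial \mathbb{H}}{\partial q^\mu} = -\frac{dp_\mu}{ds} \tag{17.83}$$

$$\frac{\partial \mathbb{H}}{\partial t} = \frac{d\mathcal{E}}{ds} \tag{17.84}$$

$$\frac{\partial \mathbb{H}}{\partial \mathcal{E}} = -\frac{dt}{ds} \tag{17.85}$$

These canonical equations give that the total derivative of $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ with respect to $s$ is

$$\frac{d\mathbb{H}}{ds} = \frac{\partial \mathbb{H}}{\partial p_\mu}\frac{dp_\mu}{ds} + \frac{\partial \mathbb{H}}{\partial q^\mu}\frac{dq^\mu}{ds} + \frac{\partial \mathbb{H}}{\partial t}\frac{dt}{ds} + \frac{\partial \mathbb{H}}{\partial \mathcal{E}}\frac{d\mathcal{E}}{ds}$$

$$= \frac{dq^\mu}{ds}\frac{dp_\mu}{ds} - \frac{dp_\mu}{ds}\frac{dq^\mu}{ds} + \frac{d\mathcal{E}}{ds}\frac{dt}{ds} - \frac{dt}{ds}\frac{d\mathcal{E}}{ds} = 0 \tag{17.86}$$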
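That is, in contrast to the total time derivative of $H(\mathbf{q}, \mathbf{p}, t)$, the total $s$ derivative of the extended
Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ always vanishes, that is, $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ is autonomous which is ideal
for use with Hamilton’s equations of motion. The constraints give that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0$ (equation
17.81) and $\frac{d\mathbb{H}}{ds} = 0$ (equation 17.86), implying that the correlation between the extended and conventional
Hamiltonians is given by

$$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}) = \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \tag{17.87}$$

$$= \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{17.88}$$

$$= \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} + \left[H(\mathbf{q}, \mathbf{p}, t) - \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{dt}\right]\frac{dt}{ds} \tag{17.89}$$

$$= \left(H(\mathbf{q}, \mathbf{p}, t) - \mathcal{E}\right)\frac{dt}{ds} = 0 \tag{17.90}$$

since only the term with $\mu = 0$ does not cancel in equation 17.79. Equations 17.81 and 17.90 give that both the
left and right-hand sides of equation 17.90 are zero while equation 17.86 implies that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
is a constant of motion, that is, $s$ is a cyclic variable for $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$. Formally one can consider
that the extended Hamiltonian is a constant which equals zero

$$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \mathbb{E}(s) = 0 \tag{17.91}$$

Equations 17.84, 17.85 imply that $(\mathcal{E}, t)$ form a pair of canonically conjugate variables in addition to the
newly-introduced canonically-conjugate variables $(\mathbb{E}(s), s)$. Equation 17.90 shows that the motion in the
$2n + 2$ extended phase space is constrained to the surface reflecting the fact that the observed system has
one less degree of freedom than used by the extended Hamiltonian.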

In summary, the Lorentz-invariant extended canonical formalism leads to Hamilton’s first-order equations 
of motion in terms of derivatives with respect to s, where s is related to the proper time 7 for a relativistic 
system. 
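The Legendre transformation 17.79 can be made concrete for the relativistic free particle. The sketch below is illustrative and not part of the original text: it assumes the extended free-particle Lagrangian $\mathbb{L} = \frac{1}{2}mc^2[\frac{1}{c^2}(\frac{dq}{ds})^2 - (\frac{dt}{ds})^2 - 1]$ in one spatial dimension with $m = c = 1$, from which $p = m\frac{dq}{ds}$ and $\mathcal{E} = mc^2\frac{dt}{ds}$. The closed form $\mathbb{H} = \frac{p^2}{2m} - \frac{\mathcal{E}^2}{2mc^2} + \frac{1}{2}mc^2$ used for comparison is worked out here from those assumptions rather than quoted from the text.

```python
import math

M = 1.0   # rest mass (illustrative units)
C = 1.0   # speed of light (illustrative units)

def L_ext(q_s, t_s):
    """Assumed extended free-particle Lagrangian; q_s = dq/ds, t_s = dt/ds."""
    return 0.5 * M * C ** 2 * ((q_s / C) ** 2 - t_s ** 2 - 1.0)

def momenta(q_s, t_s):
    """p = dL_ext/d(q_s); energy = -dL_ext/d(t_s), because p_0 = -energy/c."""
    return M * q_s, M * C ** 2 * t_s

def H_ext_legendre(q_s, t_s):
    """Equation 17.79: H_ext = p dq/ds - energy dt/ds - L_ext."""
    p, energy = momenta(q_s, t_s)
    return p * q_s - energy * t_s - L_ext(q_s, t_s)

def H_ext_closed(p, energy):
    """Closed form derived for this sketch from the Legendre transform above."""
    return p ** 2 / (2.0 * M) - energy ** 2 / (2.0 * M * C ** 2) + 0.5 * M * C ** 2

v = 0.6
g = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
q_s, t_s = g * v, g              # physical motion: dt/ds = gamma, dq/ds = gamma v
p, energy = momenta(q_s, t_s)
H_on_surface = H_ext_legendre(q_s, t_s)
```

On the physical trajectory `H_on_surface` vanishes, as required by equation 17.81, and the condition $\mathbb{H} = 0$ reproduces $\mathcal{E}^2 = p^2c^2 + m^2c^4$.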




17.7.2 Extended Poisson Bracket representation 


Struckmeier[Str08] investigated the usefulness of the extended formalism when applied to the Poisson bracket
representation of Hamiltonian mechanics. The extended Poisson bracket for two differentiable functions $F$
and $G$ is defined as

$$[[F, G]] = \sum_{i=1}^{n}\left(\frac{\partial F}{\partial q^i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q^i}\right) - \frac{\partial F}{\partial t}\frac{\partial G}{\partial \mathcal{E}} + \frac{\partial F}{\partial \mathcal{E}}\frac{\partial G}{\partial t} \tag{17.92}$$

As for the conventional Poisson bracket discussed in chapter 15, the extended Poisson bracket also leads to the
fundamental Poisson bracket relations

$$[[q^i, q^j]] = 0 \qquad [[p_i, p_j]] = 0 \qquad [[q^i, p_j]] = \delta^i_j \tag{17.93}$$

where $i, j = 0, 1, \ldots, n$. These are identical to the non-extended fundamental Poisson brackets.

The discussion of observables in Hamiltonian mechanics in chapter 15.2.5 can be trivially expanded to
the extended Poisson bracket representation. In particular, the total $s$ derivative of the function $G$ is given
by

$$\frac{dG}{ds} = \frac{\partial G}{\partial s} + [[G, \mathbb{H}]] \tag{17.94}$$

If $G$ commutes with the extended Hamiltonian, that is, the Poisson bracket equals zero, and if $\frac{\partial G}{\partial s} = 0$, then
$\frac{dG}{ds} = 0$. That is, the observable $G$ is a constant of motion.

Substituting the fundamental variables for $G$ gives

$$\frac{dp_i}{ds} = [[p_i, \mathbb{H}]] = -\frac{\partial \mathbb{H}}{\partial q^i} \qquad \frac{dq^i}{ds} = [[q^i, \mathbb{H}]] = \frac{\partial \mathbb{H}}{\partial p_i} \tag{17.95}$$

where $i = 0, 1, \ldots, n$. These are Hamilton’s extended canonical equations of motion expressed in terms of
the system evolution parameter $s$. The extended Poisson bracket representation is a trivial extension of the
conventional canonical equations presented in chapter 15.3.


17.7.3 Extended canonical transformation and Hamilton-Jacobi theory 


Struckmeier[Str08] presented plausible extended versions of canonical transformation and Hamilton-Jacobi
theories that can be used to provide a Lorentz-invariant formulation of Hamiltonian mechanics for relativistic
one-body systems. A detailed description can be found in Struckmeier[Str08].⁵


17.7.4 Validity of the extended Hamilton-Lagrange formalism 


It has been shown that the extended Lagrangian and Hamiltonian formalism, based on the parametric model 
of Lanczos[La49], leads to a plausible manifestly-covariant approach for the one-body system. The general 
features developed for handling Lagrangian and Hamiltonian mechanics carry over to the Special Theory 
of Relativity assuming the use of a non-standard, extended Lagrangian or Hamiltonian. This expansion of 
the range of validity of the well-known Hamiltonian and Lagrangian mechanics into the relativistic domain 
is important, and reduces any Lorentz transformation to a canonical transformation. The validity of this 
extended Hamilton-Lagrange formalism has been criticized, and problems exist extending this approach to 
the N-body system for N > 1. For example, as discussed by Goldstein[Go50] and Johns[Jo05], each of
the N moving bodies has its own world line and momentum. Defining the total momentum P requires
knowing simultaneously the momenta of the individual bodies, but simultaneity is body dependent and 
thus even the total momentum is not a simple four vector. A general method is required that will allow 
using a manifestly-covariant Lagrangian or Hamiltonian for the N-body system. For the one-body system, 
the extended Hamilton-Lagrange formalism provides a powerful and logical approach to exploit analytical 
mechanics in the relativistic domain that retains the form of the conventional Lagrangian/Hamiltonian 
formalisms. Note that Noether’s theorem relating energy and time is readily apparent using the extended 
formalism. 


⁵Note that Greiner[Gr10] includes a reproduction of the Struckmeier paper[Str08].
17.7 Example: The Bohr-Sommerfeld hydrogen atom


The classical relativistic hydrogen atom was first solved by Sommerfeld in 1916. Sommerfeld used Bohr’s
“old quantum theory” plus Hamiltonian mechanics to make an important step in the development of quantum
mechanics by obtaining the first-order expressions for the fine structure of the hydrogen atom. As in the
non-relativistic case, the motion is confined to a plane allowing use of planar polar coordinates. Thus the
relativistic Lagrangian is given by

$$L = -mc^2\sqrt{1 - \frac{\dot{r}^2 + r^2\dot{\theta}^2}{c^2}} + \frac{ke^2}{r}$$

The canonical momenta are given by

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = \gamma m r^2\dot{\theta}$$

$$p_r = \frac{\partial L}{\partial \dot{r}} = \gamma m\dot{r}$$

$$\dot{p}_\theta = \frac{\partial L}{\partial \theta} = 0$$

$$\dot{p}_r = \frac{\partial L}{\partial r} = \gamma m r\dot{\theta}^2 - \frac{ke^2}{r^2}$$

As for the non-relativistic case, $\theta$ is a cyclic variable and thus the
angular momentum $p_\theta = \gamma m r^2\dot{\theta}$ is conserved.

The relativistic Hamiltonian for the Coulomb potential between an
electron and the proton, assuming that the motion is confined to a
plane, which allows use of planar polar coordinates, leads to

$$H = \sqrt{p_r^2c^2 + \frac{p_\theta^2c^2}{r^2} + m^2c^4} - \frac{ke^2}{r}$$

The same equations of motion are obtained using Hamiltonian mechanics, that is:

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{\gamma m r^2}$$

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{\gamma m}$$

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = 0$$

$$\dot{p}_r = -\frac{\partial H}{\partial r} = \gamma m r\dot{\theta}^2 - \frac{ke^2}{r^2}$$

The radial dependence can be solved using either Lagrangian or Hamiltonian mechanics, but the solution
is non-trivial. Using the same techniques applied to solve Kepler’s problem leads to the radial solution

$$r = \frac{\Gamma^2 p_\theta^2 c^2}{ke^2 E}\,\frac{1}{1 + \epsilon\cos[\Gamma(\theta - \theta_0)]} \qquad \Gamma^2 = 1 - \frac{k^2e^4}{c^2p_\theta^2} \qquad \epsilon^2 = 1 + \frac{\Gamma^2\left(1 - \frac{m^2c^4}{E^2}\right)}{1 - \Gamma^2}$$

The apses are $r_{min} = \frac{\Gamma^2 p_\theta^2 c^2}{ke^2 E(1 + \epsilon)}$ for $\Gamma(\theta - \theta_0) = 0, 2\pi, 4\pi, \ldots$ and
$r_{max} = \frac{\Gamma^2 p_\theta^2 c^2}{ke^2 E(1 - \epsilon)}$ for $\Gamma(\theta - \theta_0) = \pi, 3\pi, \ldots$. The
perihelion advances between cycles due to the change in relativistic mass during the trajectory as shown in
the adjacent figure. This precession leads to the fine structure observed in the optical spectra of the hydrogen
atom. The same precession of the perihelion occurs for planetary motion, however, there is a comparable
size effect due to gravity that requires use of general relativity to compute the trajectories.

Figure: The advance of the perihelion of bound orbits due to the dependence of the relativistic mass on velocity.
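The size of the perihelion advance can be estimated from the factor $\Gamma$ in the radial solution. Successive perihelia occur when $\Gamma(\theta - \theta_0)$ increases by $2\pi$, so the orbit angle advances by $2\pi/\Gamma - 2\pi$ per radial period. The sketch below is illustrative only: it assumes $\Gamma^2 = 1 - \left(\frac{ke^2}{p_\theta c}\right)^2$, the old-quantum-theory condition $p_\theta = n_\theta\hbar$ (so that $\frac{ke^2}{p_\theta c} = \frac{\alpha}{n_\theta}$), and an approximate value of the fine-structure constant $\alpha$; none of these numbers are taken from the text.

```python
import math

ALPHA = 7.2973525693e-3   # fine-structure constant k e^2 / (hbar c), approximate

def gamma_factor(n_theta, alpha=ALPHA):
    """Gamma = sqrt(1 - (k e^2 / (p_theta c))^2), assuming p_theta = n_theta * hbar."""
    return math.sqrt(1.0 - (alpha / n_theta) ** 2)

def perihelion_advance(n_theta, alpha=ALPHA):
    """Angle in radians by which the perihelion advances per radial period."""
    return 2.0 * math.pi * (1.0 / gamma_factor(n_theta, alpha) - 1.0)

advance_n1 = perihelion_advance(1)   # lowest angular-momentum orbit
advance_n2 = perihelion_advance(2)
```

To leading order the advance is $\pi\alpha^2/n_\theta^2$ radians per radial period, which is small and decreases with increasing angular momentum.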
17.8 The General Theory of Relativity


Einstein's General Theory of Relativity expands the scope of relativistic mechanics to include non-inertial 
accelerating frames plus a unified theory of gravitation. That is, the General Theory of Relativity incorpo- 
rates both the Special Theory of Relativity as well as Newton’s Law of Universal Gravitation. It provides 
a unified theory of gravitation that is a geometric property of space and time. In particular, the curvature 
of space-time is directly related to the four-momentum of matter and radiation. Unfortunately, Einstein’s 
equations of general relativity are nonlinear partial differential equations that are difficult to solve exactly, 
and the theory requires knowledge of Riemannian geometry that goes beyond the scope of this book. The 
following summarizes the fundamental variational concepts underlying the theory, and the experimental 
evidence in support of the General Theory of Relativity. 


17.8.1 The fundamental concepts 


Einstein incorporated the following concepts in the General Theory of Relativity. 


Mach’s principle: 


The 1883 work “The Science of Mechanics” by the philosopher/physicist, Ernst Mach, criticized Newton’s 
concept of an absolute frame of reference, and suggested that local physical laws are determined by the large- 
scale structure of the universe. Mach’s Principle assumes that local motion of a rotating frame is determined 
by the large-scale distribution of matter, that is, relative to the fixed stars. Einstein’s interpretation of 
Mach’s statement was that the inertial properties of a body are determined by the presence of other bodies
in the universe, and he named this concept “Mach’s Principle”. 


Equivalence principle: 


The equivalence principle comprises closely-related concepts dealing with the equivalence of gravitational and 
inertial mass. The weak equivalence principle states that the inertial mass and gravitational mass of a 
body are identical, leading to acceleration that is independent of the nature of the body. Galileo is reputed to have demonstrated
this at the Leaning Tower of Pisa. Recent measurements have shown that this weak equivalence principle
is obeyed to a sensitivity of $5 \times 10^{-13}$. Einstein’s equivalence principle states that the outcome of
any local non-gravitational experiment, in a freely falling laboratory, is independent of the velocity of the 
laboratory and its location in space-time. This principle implies that the result of local experiments must be 
independent of the velocity of the apparatus. Einstein’s equivalence principle has been tested by searching 
for variations of dimensionless fundamental constants such as the fine structure constant. The strong 
equivalence principle combines the weak equivalence and Einstein equivalence principles, and implies 
that the gravitational constant is constant everywhere in the universe. The strong equivalence principle 
suggests that gravity is geometrical in nature and does not involve any fifth force in nature. Tests of the 
strong equivalence principle have involved searches for variations in the gravitational constant G and masses 
of fundamental particles throughout the life of the universe. 


Principle of covariance 


A physical law that is expressed in a covariant formulation has the same mathematical form in all coordinate 
systems, and is usually expressed in terms of tensor fields. In the Special Theory of Relativity, the Lorentz, 
rotational, translational and reflection transformations between inertial coordinate frames are covariant. The 
covariant quantities are the 4-scalars, and 4-vectors in Minkowski space-time. Einstein recognized that the 
principle of covariance, that is built into the Special Theory of Relativity, should apply equally to accelerated 
relative motion in the General Theory of Relativity. He exploited tensor calculus to extend the Lorentz 
covariance to the more general local covariance in the General Theory of Relativity. The reduction locally 
of the general metric tensor to the Minkowski metric corresponds to free-falling motion, that is geodesic 
motion, and thus encompasses gravitation. 


Principle of minimal gravitational coupling


The principle of minimal gravitational coupling requires that the total Lagrangian for the field equations of 
general relativity consist of two additive parts, one part corresponding to the free gravitational Lagrangian, 
and the other part to external source fields in curved space-time. 


Correspondence principle 


The Correspondence Principle states that the predictions of any new scientific theory must reduce to the 
predictions of well established earlier theories under circumstances for which the preceding theory was known 
to be valid. The Correspondence Principle is an important concept used both in quantum mechanics and 
relativistic mechanics. Einstein’s Special Theory of Relativity satisfies the Correspondence Principle be- 
cause it reduces to classical mechanics in the limit of velocities small compared to the speed of light. The 
Correspondence Principle requires that the General Theory of Relativity reduce to the Special Theory of 
Relativity for inertial frames, and should approximate Newton’s Theory of Gravitation in weak fields and at 
low velocities. 
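
The low-velocity limit required by the Correspondence Principle can be checked numerically. The sketch below compares the relativistic kinetic energy $(\gamma - 1)mc^2$ with the Newtonian value $\frac{1}{2}mv^2$; the function names and the sample velocities are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def kinetic_energy_relativistic(mass, speed):
    """Relativistic kinetic energy (gamma - 1) m c^2 in joules."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return (gamma - 1.0) * mass * C ** 2


def kinetic_energy_newtonian(mass, speed):
    """Newtonian kinetic energy m v^2 / 2 in joules."""
    return 0.5 * mass * speed ** 2


def relative_difference(mass, speed):
    """Fractional amount by which the Newtonian value falls short of the relativistic one."""
    t_rel = kinetic_energy_relativistic(mass, speed)
    return (t_rel - kinetic_energy_newtonian(mass, speed)) / t_rel


if __name__ == "__main__":
    for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
        print(beta, relative_difference(1.0, beta * C))
```

The fractional difference grows roughly as $\frac{3}{4}(v/c)^2$ at low velocity, so the two expressions agree to better than one part in $10^4$ at $v = 0.01c$ but differ substantially at $v = 0.5c$.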


17.8.2 Einstein’s postulates for the General Theory of Relativity 


Einstein realized that the Equivalence Principle relating the gravitational and inertial masses implies that 
the constancy of the velocity of light in vacuum cannot hold in the presence of a gravitational field. That 
is, the Minkowskian line element must be replaced by a more general line element that takes gravity into 
account. Einstein proposed that the Minkowskian line element in four-dimensional space-time, be replaced 
by introducing a four-dimensional Riemannian geometrical structure where space, time, and matter are 
combined. As described by Lanczos [La49], [Har03], [Mu08], this astonishingly bold proposal implies that
planetary motion is described as purely a geodesic phenomenon in a certain four-space of Riemannian 
structure, where the geodesic is the equation of a curve on a manifold for any possible set of coordinates. 
This implies that the concept of “gravitational force” is discarded, and planetary motion is a manifestation 
of a pure geodesic phenomenon for forceless motion in a four-dimensional Riemannian structure. 

Chapters 6 — 9 showed that the Lagrangian and Hamiltonian representations of variational principles are 
powerful approaches for determining the equation governing geodesic constrained motion that are indepen- 
dent of the chosen frame of reference as is also required by the General Theory of Relativity. Thus variational 
principles provide a theoretical representation for the General Theory of Relativity. The Einstein-Hilbert
action is defined as

$$S = \int\left[\frac{1}{16\pi G c^{-4}}R + \mathcal{L}_M\right]\sqrt{-g}\,d^4x \qquad (17.96)$$

where $G$ is the gravitational constant, $R$ is the Ricci scalar, $\mathcal{L}_M$ accounts for matter fields, and $g$ is the
determinant of the metric tensor matrix. Variational principles applied to the Einstein-Hilbert action lead to
Einstein’s relativistic field equations of the General Theory of Relativity. Thus
the variational approach unifies relativistic mechanics with other classical theories, such as mechanics and
electromagnetism, which also were formulated in terms of least action. In relativistic mechanics, the use of
action identifies the gravitational coupling of the metric to matter as well as identifying conserved quantities
and symmetries using Noether’s theorem. The Einstein-Hilbert action expands the scope of variational
principles to include general relativity, illustrating the crucial role played by variational principles in physics.

To summarize, the Special Theory of Relativity implies that the Newtonian concepts of absolute frame
of reference and separation of space and time are invalid. The General Theory of Relativity goes beyond
the Special Theory by implying that the gravitational force, and the resultant planetary motion, can be
described as pure geodesic phenomena for forceless motion in a four-dimensional Riemannian structure.
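
For orientation, the outcome of varying the Einstein-Hilbert action with respect to the metric can be stated without derivation. The following is a sketch in standard notation that is not taken from this book: $g^{\mu\nu}$ is the inverse metric, $R_{\mu\nu}$ is the Ricci tensor, and $T_{\mu\nu}$ is the stress-energy tensor of the matter fields, whose definition and overall sign depend on the metric convention assumed here.

```latex
\delta S = 0
\quad\Longrightarrow\quad
R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = \frac{8\pi G}{c^{4}}\, T_{\mu\nu},
\qquad
T_{\mu\nu} \equiv -\frac{2}{\sqrt{-g}}\,
\frac{\delta\left(\sqrt{-g}\,\mathcal{L}_M\right)}{\delta g^{\mu\nu}}
```

The coupling constant $8\pi G/c^4$ on the right-hand side is twice the reciprocal of the coefficient $c^4/16\pi G$ that multiplies the Ricci scalar in the action.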


17.8.3 Experimental evidence in support of the General Theory of Relativity 


The following experimental evidence in support of Einstein’s Theory of General Relativity is compelling. 


Kepler problem In 1915 Einstein showed that relativistic mechanics explained the anomalous precession 
of the perihelion of the planet Mercury, that is, the axes of the elliptical Kepler orbit are observed to precess.


Deflection of light Einstein’s prediction of the deflection of light in a gravitational field was confirmed by
Eddington during the solar eclipse of 29 May 1919. Pictures of stars in the region around the Sun showed that 
their apparent locations were slightly shifted because the light from the stars had been deflected on passing
through the gravitational field close to the Sun.


Gravitational lensing The deflection of light by the gravitational attraction of a massive object situated 
between a distant star and the observer has resulted in the observation of multiple images of a distant quasar. 


Gravitational time dilation and frequency shift Processes occurring in a strong gravitational field are
slower than in a weak gravitational field; this is called gravitational time dilation. In addition, light climbing
out of a gravitational well is red shifted. The gravitational time dilation has been measured many times and
the successful operation of the Global Positioning System provides an ongoing validation. The gravitational
red shift has been confirmed in the laboratory using the precise Mössbauer effect in nuclear physics. Tests
in stronger gravitational fields are provided by studies of binary pulsars.
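
The size of the effect for the Global Positioning System can be estimated with a short calculation. The sketch below assumes the weak-field approximation, a circular orbit of radius 26,560 km, and a ground clock at the mean Earth radius with the Earth's rotation neglected; these numbers and all function names are illustrative assumptions rather than values from the text.

```python
GM_EARTH = 3.986004418e14    # Earth's gravitational parameter, m^3/s^2
C = 299_792_458.0            # speed of light, m/s
R_GROUND = 6.371e6           # mean Earth radius, m (ground clock, rotation neglected)
R_ORBIT = 2.656e7            # assumed GPS orbital radius, m
SECONDS_PER_DAY = 86_400.0


def gravitational_offset(r_low, r_high):
    """Fractional rate by which the higher clock runs fast (weak-field approximation)."""
    return GM_EARTH / C**2 * (1.0 / r_low - 1.0 / r_high)


def velocity_offset(r_orbit):
    """Fractional rate by which a clock in circular orbit runs slow, to order v^2/c^2."""
    v_squared = GM_EARTH / r_orbit
    return -v_squared / (2.0 * C**2)


def net_microseconds_per_day(r_ground=R_GROUND, r_orbit=R_ORBIT):
    """Net gain of the orbiting clock relative to the ground clock, in microseconds per day."""
    rate = gravitational_offset(r_ground, r_orbit) + velocity_offset(r_orbit)
    return rate * SECONDS_PER_DAY * 1e6


if __name__ == "__main__":
    print(f"net offset: {net_microseconds_per_day():.1f} microseconds per day")
```

Under these assumptions the gravitational term dominates the special-relativistic time dilation, and the satellite clock gains a few tens of microseconds per day, which would correspond to kilometers of ranging error per day if left uncorrected.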


Black holes When the mass to radius ratio of a massive object becomes sufficiently large, general relativity 
predicts formation of a black hole, which is a region of space from which neither light nor matter can escape. 
Supermassive black holes, with a mass that can be $10^6$ to $10^9$ solar masses, are thought to have played an
important role in formation of the galaxies. 


Gravitational wave detection In 1916 Einstein predicted the existence of gravitational waves on the
basis of the theory of general relativity. The first indirect detection of gravitational waves was made in
1976 by Hulse and Taylor who detected a decrease in the orbital period due to significant energy loss which
presumably was associated with emission of gravitational waves by the compact neutron star in the binary pulsar
PSR 1913+16. The most compelling direct evidence for observation of a gravitational wave was made
on 14 September 2015 by the LIGO Laser Interferometer Gravitational-Wave Observatories. The waveform
detected by the two LIGO observatories matched the predictions of General Relativity for gravitational waves
emanating from the inward spiral plus merger of a pair of black holes of around 36 and 29 solar masses,
followed by the ringdown of the resultant single black hole. The gravitational wave emitted by this cataclysmic merger
reached Earth as a ripple in space-time that changed the length of the 4 km LIGO arm by a thousandth of
the width of the proton. The gravitational energy emitted was $3.0^{+0.5}_{-0.5}\,c^2$ solar masses. A second observation
of gravitational waves was made on 26 December 2015, and four similar observations were made during
2017. The detection of such minuscule changes in space-time is a truly remarkable achievement. This direct
detection of gravitational waves resulted in the award of the 2017 Nobel Prize to Rainer Weiss, Barry
Barish, and Kip Thorne. Gravitational wave detection has opened a powerful new frontier in
astrophysics that could lead to exciting new physics.
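
To put the numbers quoted for the first LIGO detection in perspective, the sketch below converts an arm-length change of a thousandth of a proton width into a dimensionless strain, and converts three solar masses into joules. The proton width of $1.7 \times 10^{-15}$ m, the solar mass value, and the function names are assumptions made for illustration.

```python
C = 299_792_458.0        # speed of light, m/s
M_SUN = 1.989e30         # solar mass, kg
ARM_LENGTH = 4.0e3       # LIGO arm length, m
PROTON_WIDTH = 1.7e-15   # rough proton diameter, m (assumed value)


def strain(delta_length, arm_length=ARM_LENGTH):
    """Dimensionless strain, the fractional change in arm length."""
    return delta_length / arm_length


def radiated_energy(solar_masses):
    """Energy in joules equivalent to the given number of solar masses."""
    return solar_masses * M_SUN * C**2


if __name__ == "__main__":
    print(f"strain: {strain(PROTON_WIDTH / 1000.0):.2e}")
    print(f"energy: {radiated_energy(3.0):.2e} J")
```

The resulting strain is of order $10^{-22}$ to $10^{-21}$, while the radiated energy is of order $10^{47}$ J.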


17.9 Implications of relativistic theory to classical mechanics 


Einstein’s theories of relativity have had an enormous impact on twentieth century physics and the philosophy 
of science. Relativistic mechanics is crucial to an understanding of the physics of the atom, nucleus and the 
substructure of the nucleons, but the impacts are minimal in everyday experience. The Special Theory of 
Relativity replaces Newton’s Laws of motion; i.e. Newton’s law is only an approximation applicable for low 
velocities. The General Theory of Relativity replaces Newton’s Law of Gravitation and provides a natural 
explanation of the equivalence principle. Einstein’s theories of relativity imply a profound and fundamental 
change in the view of the separation of space, time, and mass, that contradicts the basic tenets that are 
the foundation of Newtonian mechanics. The Newtonian concepts of absolute frame of reference, plus the 
separation of space, time, and mass, are invalid at high velocities. Lagrangian and Hamiltonian variational 
approaches to classical mechanics provide the formalism necessary for handling relativistic mechanics. The 
present chapter has shown that logical extensions of Lagrangian and Hamiltonian mechanics lead to the 
relativistically-invariant extended Lagrangian and Hamiltonian formulations of mechanics which are adequate 
for handling one-body systems. However, major unsolved problems remain applying these formulations to 
systems that have more than one body. 


17.10 Summary


Special theory of relativity: The Special Theory of Relativity is based on Einstein’s postulates:

1) The laws of nature are the same in all inertial frames of reference.

2) The velocity of light in vacuum is the same in all inertial frames of reference.

For a primed frame moving along the $x$ axis with velocity $v$, Einstein’s postulates imply the following
Lorentz transformations between the moving (primed) and stationary (unprimed) frames

$$x' = \gamma(x - vt) \qquad x = \gamma(x' + vt')$$
$$y' = y \qquad y = y'$$
$$z' = z \qquad z = z'$$
$$t' = \gamma\left(t - \frac{vx}{c^2}\right) \qquad t = \gamma\left(t' + \frac{vx'}{c^2}\right)$$

where the Lorentz $\gamma$ factor $\gamma = \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$.

Lorentz transformations were used to illustrate Lorentz contraction, time dilation, and simultaneity. An
elementary review was given of relativistic kinematics including discussion of velocity transformation, linear
momentum, center-of-momentum frame, forces and energy.
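
The Lorentz transformations summarized here can be exercised with a few lines of Python. This is a minimal sketch for motion along the $x$ axis only; the function names are illustrative assumptions. It transforms an event to the primed frame and back, and evaluates the invariant interval $(ct)^2 - x^2$.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz_gamma(v):
    """Lorentz factor 1 / sqrt(1 - (v/c)^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def to_primed(x, t, v):
    """Transform event (x, t) to the frame moving with velocity v along x."""
    g = lorentz_gamma(v)
    return g * (x - v * t), g * (t - v * x / C**2)


def to_unprimed(x_p, t_p, v):
    """Inverse transformation from the moving frame back to the stationary frame."""
    g = lorentz_gamma(v)
    return g * (x_p + v * t_p), g * (t_p + v * x_p / C**2)


def interval_squared(x, t):
    """Invariant interval (ct)^2 - x^2 for an event measured from the origin."""
    return (C * t) ** 2 - x**2


if __name__ == "__main__":
    v = 0.6 * C
    x_p, t_p = to_primed(100.0, 1.0e-6, v)
    print(lorentz_gamma(v), x_p, t_p)
    print(to_unprimed(x_p, t_p, v))
    print(interval_squared(100.0, 1.0e-6), interval_squared(x_p, t_p))
```

The round trip recovers the original event, and the interval is the same in both frames to within rounding error, which is the property that makes the Lorentz transformation an orthogonal transformation in Minkowski space.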


Geometry of space-time: The concepts of four-dimensional space-time were introduced. A discussion of 
four-vector scalar products introduced the use of contravariant and covariant tensors plus the Minkowski
metric $g$ in terms of which the scalar product was defined. The Minkowski representation of space-time and the
momentum-energy four-vector also were introduced.


Lorentz-invariant formulation of Lagrangian mechanics: The Lorentz-invariant extended Lagrangian
formalism, developed by Struckmeier [Str08], based on the parametric approach pioneered by
Lanczos [La49], provides a viable Lorentz-invariant extension of conventional Lagrangian mechanics that
is applicable for one-body motion in the realm of the Special Theory of Relativity.


Lorentz-invariant formulation of Hamiltonian mechanics: The Lorentz-invariant extended Hamil- 
tonian formalism, developed by Struckmeier based on the parametric approach pioneered by Lanczos, was 
introduced. It provides a viable Lorentz-invariant extension of conventional Hamiltonian mechanics that is 
applicable for one-body motion in the realm of the Special Theory of Relativity. In particular, it was shown 
that the Lorentz-invariant extended Hamiltonian is conserved making it ideally suited for solving compli- 
cated systems using Hamiltonian mechanics via use of the Poisson-bracket representation of Hamiltonian 
mechanics, canonical transformations, and the Hamilton-Jacobi techniques. 


The General Theory of Relativity: An elementary summary was given of the fundamental concepts
of the General Theory of Relativity and the resultant unified description of the gravitational force plus
planetary motion as geodesic motion in a four-dimensional Riemannian structure. Variational mechanics
was shown to be ideally suited to applications of the General Theory of Relativity.


Philosophical implications: Newton’s equations of motion, and his Law of Gravitation, that reigned 
supreme from 1687 to 1905, have been toppled from the throne by Einstein’s theories of relativistic mechanics. 
By contrast, the complete independence of coordinate frames in the Lagrangian and Hamiltonian formulations of
classical mechanics, plus the underlying Principle of Least Action, are equally valid in both the relativistic and 
non-relativistic regimes. As a consequence, relativistic Lagrangian and Hamiltonian formulations underlie 
much of modern physics, especially quantum physics, which explains why relativistic mechanics plays such 
an important role in classical dynamics. 




Workshop exercises 


1. A relativistic snake of proper length 100 cm is travelling to the right across a butcher’s table at $v = 0.6c$. You
hold two meat cleavers, one in each hand, which are 100 cm apart. You strike the table simultaneously with
both cleavers at the moment when the left cleaver lands just behind the tail of the snake. You rationalize that
since the snake is moving with $\beta = 0.6$, then the length of the snake is Lorentz contracted by the factor $\gamma = \frac{5}{4}$
and thus the Lorentz-contracted length of the snake is 80 cm and thus it will not be harmed. However, the snake
reasons that relative to it the cleavers are moving at $\beta = 0.6$ and thus are only 80 cm apart when they strike
the 100 cm long snake and thus it will be severed. Use the Lorentz transformation to resolve this paradox.


2. Explain what is meant by the following statement: “Lorentz transformations are orthogonal transformations 
in Minkowski space.” 


3. Which of the following are invariant quantities in space-time? 


(a) Energy
(b) Momentum
(c) Mass
(d) Force
(e) Charge
(f) The length of a vector
(g) The length of a four-vector


4. What does it mean for two events to have a spacelike interval? What does it mean for them to have a timelike 
interval? Draw a picture to support your answer. In which case can events be causally connected? 


Problems 


1. A supply rocket flies past two markers on the Space Station that are 50 m apart in a time of 0.2 μs as measured
by an observer on the Space Station.


(a) What is the separation of the two markers as seen by the pilot riding in the supply rocket? 
(b) What is the elapsed time as measured by the pilot in the supply rocket? 
(c) What are the speeds calculated by the observer in the Space Station and the pilot of the supply rocket? 


2. The Compton effect involves a photon of incident energy $E_i$ being scattered by an electron of mass $m_e$ which
initially is stationary. The photon scattered at an angle $\theta$ with respect to the incident photon has a final energy
$E_f$. Using the special theory of relativity derive a formula that relates $E_f$ and $E_i$ to $\theta$.


3. Pair creation involves production of an electron-positron pair by a photon. Show that such a process is 
impossible unless some other body, such as a nucleus, is involved. Suppose that the nucleus has a mass M 
and the electron a mass $m_e$. What is the minimum energy that the photon must have in order to produce an
electron-positron pair? 


4. A K meson of rest energy 494 MeV decays into a μ meson of rest energy 106 MeV and a neutrino of zero
rest energy. Find the kinetic energies of the μ meson and the neutrino into which the K meson decays while
at rest.


Chapter 18 


The transition to quantum physics 


18.1 Introduction 


Classical mechanics, including extensions to relativistic velocities, embraces an unusually broad range of topics
ranging from astrophysics to nuclear and particle physics, from one-body to many-body statistical mechanics. 
It is interesting to discuss the role of classical mechanics in the development of quantum mechanics which 
plays a crucial role in physics. A valid question is “why discuss quantum mechanics in a classical mechanics 
course?”. The answer is that quantum mechanics supersedes classical mechanics as the fundamental the- 
ory of mechanics. Classical mechanics is an approximation applicable for situations where quantization is 
unimportant. Thus there must be a correspondence principle that relates quantum mechanics to classical 
mechanics, analogous to the relation between relativistic and non-relativistic mechanics. It is illuminating to 
study the role played by the Hamiltonian formulation of classical mechanics in the development of quantal 
theory and statistical mechanics. The Hamiltonian formulation is expressed in terms of the phase-space 
variables q, p for which there are well-established rules for transforming to quantal linear operators. 


18.2 Brief summary of the origins of quantum theory 


The last decade of the 19th century saw the culmination of classical physics. By 1900 scientists thought
that the basic laws of mechanics, electromagnetism, and statistical mechanics were understood and worried 
that future physics would be reduced to confirming theories to the fifth decimal place, with few major new 
discoveries to be made. However, technical developments such as photography, vacuum pumps, induction 
coil, etc., led to important discoveries that revolutionized physics and toppled classical mechanics from its 
throne at the beginning of the 20th century. Table 18.1 summarizes some of the major milestones leading
up to the development of quantum mechanics. 

Max Planck searched for an explanation of the spectral shape of the black-body electromagnetic radia- 
tion. He found an interpolation between two conflicting theories, one that reproduced the short wavelength 
behavior, and the other the long wavelength behavior. Planck’s interpolation required assuming that electro- 
magnetic radiation was not emitted with a continuous range of energies, but that electromagnetic radiation 
is emitted in discrete bundles of energy called quanta. In December 1900 he presented his theory which
reproduced precisely the measured black body spectral distribution by assuming that the energy is emitted in
integer multiples of the energy carried by a single quantum:

$$E = h\nu = \frac{hc}{\lambda} \qquad (18.1)$$

where $\nu$ is the frequency and $\lambda$ the wavelength of the electromagnetic radiation, and Planck’s constant, $h = 6.626 \times 10^{-34}$ J·s, was
the best fit parameter of the interpolation. That is, Planck assumed that energy comes in discrete bundles
of energy equal to $h\nu$ which are called quanta. By making this extreme assumption, in an act of desperation,
Planck was able to reproduce the experimental black body radiation spectrum. The assumption that energy 
was exchanged in bundles hinted that the classical laws of physics were inadequate in the microscopic 
domain. The older generation physicists initially refused to believe Planck’s hypothesis which underlies 
quantum theory. It was the new generation physicists, like Einstein, Bohr, Heisenberg, Born, Schrödinger,
and Dirac, who developed Planck’s hypothesis leading to the revolutionary quantum theory. 
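
Planck's relation is easy to evaluate numerically. The following sketch computes the energy of a single quantum from either the frequency or the wavelength; the 500 nm example wavelength and the function names are assumptions chosen for illustration.

```python
H = 6.62607015e-34       # Planck's constant, J s
C = 299_792_458.0        # speed of light, m/s
EV = 1.602176634e-19     # joules per electron volt


def photon_energy_from_frequency(frequency_hz):
    """Energy of one quantum, E = h nu, in joules."""
    return H * frequency_hz


def photon_energy_from_wavelength(wavelength_m):
    """Energy of one quantum, E = h c / lambda, in joules."""
    return H * C / wavelength_m


if __name__ == "__main__":
    wavelength = 500e-9  # green light, assumed example value
    energy = photon_energy_from_wavelength(wavelength)
    print(f"{energy:.3e} J = {energy / EV:.2f} eV")
```

A visible photon therefore carries an energy of a few electron volts, which is why quantization is unobservable for macroscopic light sources but decisive at the atomic scale.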

In 1905, Einstein predicted the existence of the photon, derived the theory of specific heat, as well 
as deriving the Theory of Special Relativity. It is remarkable to realize that he developed these three 
revolutionary theories in one year, when he was only 26 years old. Einstein uncovered an inconsistency in 
Planck’s derivation of the black body spectral distribution in that it assumed the statistical part of the energy 
is quantized, whereas the electromagnetic radiation assumed Maxwell’s equations with oscillator energies 
being continuous. Planck demanded that light of frequency $\nu$ be packaged in quanta whose energies were
multiples of $h\nu$, but Planck never thought that light would have particle-like behavior. Newton believed that
light involved corpuscles, and Hamilton developed the Hamilton-Jacobi theory seeking to describe light in
terms of the corpuscle theory. However, Maxwell had convinced physicists that light was a wave phenomenon;
interference plus diffraction effects were convincing manifestations of the wave-like properties of light. In
order to reproduce Planck’s prediction, Einstein had to treat black-body radiation as if it consisted of a gas
of photons, each photon having energy $E = h\nu$. This was a revolutionary concept that returned to Newton’s
corpuscle theory of light. Einstein realized that there were direct tests of his photon hypothesis, one of which
is the photo-electric effect. According to Einstein, each photon has an energy $E = h\nu$, in contrast to the
classical case where the energy of the photoelectron depends on the intensity of the light. Einstein predicted
that the ejected electron will have a kinetic energy


$$KE = h\nu - W \qquad (18.2)$$


where W is the work function which is the energy needed to remove an electron from a solid. 
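
Einstein's photo-electric relation, that the kinetic energy equals the photon energy minus the work function, can be illustrated with a short calculation. The work function of 2.3 eV used below is an assumed value, roughly typical of an alkali metal, and the function name is hypothetical.

```python
H = 6.62607015e-34       # Planck's constant, J s
C = 299_792_458.0        # speed of light, m/s
EV = 1.602176634e-19     # joules per electron volt


def photoelectron_energy_ev(wavelength_m, work_function_ev):
    """Maximum kinetic energy in eV of the ejected electron, or None below threshold."""
    photon_energy_ev = H * C / wavelength_m / EV
    kinetic_energy_ev = photon_energy_ev - work_function_ev
    if kinetic_energy_ev <= 0.0:
        return None  # photon energy is insufficient to eject an electron
    return kinetic_energy_ev


if __name__ == "__main__":
    work_function = 2.3  # eV, assumed illustrative value
    for wavelength_nm in (400.0, 500.0, 700.0):
        print(wavelength_nm, photoelectron_energy_ev(wavelength_nm * 1e-9, work_function))
```

The result depends only on the wavelength and not on the intensity of the light: red light at 700 nm ejects no electrons from this hypothetical surface however bright the source.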

Many older scientists, including Planck, accepted Einstein’s theory of relativity but were skeptical of the 
photon concept, even after Einstein’s photon concept was vindicated in 1915 by Millikan who showed that, 
as predicted, the energy of the ejected photoelectron depended on the frequency, and not intensity, of the 
light. In 1923 Compton demonstrated that electromagnetic radiation scattered by free electrons obeyed
simple two-body scattering laws which finally convinced the many skeptics of the existence of the photon. 


Table 18.1: Chronology of the development of quantum mechanics 


Date | Author | Development
1887 | Hertz | Discovered the photo-electric effect
1895 | Röntgen | Discovered x-rays
1896 | Becquerel | Discovered radioactivity
1897 | J.J. Thomson | Discovered the first fundamental particle, the electron
1898 | Pierre & Marie Curie | Showed that thorium is radioactive which founded nuclear physics
1900 | Planck | Quantization $E = h\nu$ explained the black-body spectrum
1905 | Einstein | Theory of special relativity
1905 | Einstein | Predicted the existence of the photon
1906 | Einstein | Used Planck’s constant to explain specific heats of solids
1909 | Millikan | The oil drop experiment measured the charge on the electron
1911 | Rutherford | Discovered the atomic nucleus with radius $10^{-15}$ m
1912 | Bohr | Bohr model of the atom explained the quantized states of hydrogen
1913 | Moseley | X-ray spectra determined the atomic number of the elements
1915 | Millikan | Used the photo-electric effect to confirm the photon hypothesis
1915 | Wilson-Sommerfeld | Proposed quantization of the action-angle integral
1921 | Stern-Gerlach | Observed space quantization in non-uniform magnetic field
1923 | Compton | Compton scattering of x-rays confirmed the photon hypothesis
1924 | de Broglie | Postulated wave-particle duality for matter and EM waves
1924 | Bohr | Explicit statement of the correspondence principle
1925 | Pauli | Postulated the exclusion principle
1925 | Goudsmit-Uhlenbeck | Postulated the spin of the electron of $s = \frac{1}{2}\hbar$
1925 | Heisenberg | Matrix mechanics representation of quantum theory
1925 | Dirac | Related Poisson brackets and commutation relations
1926 | Schrödinger | Wave mechanics
1927 | G.P. Thomson/Davisson | Electron diffraction proved wave nature of electron
1928 | Dirac | Developed the Dirac relativistic wave equation




18.2.1 Bohr model of the atom 


The Rutherford scattering experiment, performed at Manchester in 1911, discovered that the Au atom
comprised a positively charged nucleus of radius $\sim 10^{-14}$ m which is much smaller than the $1.35 \times 10^{-10}$ m
radius of the Au atom. Stimulated by this discovery, Niels Bohr joined Rutherford at Manchester in 1912 
where he developed the Bohr model of the atom. This theory was remarkably successful in spite of having 
serious inconsistencies and deficiencies. Bohr’s model assumptions were: 

1) Electromagnetic radiation is quantized with $E = h\nu$.

2) Electromagnetic radiation exhibits behavior characteristic of the emission of photons with energy
$E = h\nu$ and momentum $p = \frac{h\nu}{c}$. That is, it exhibits both wave-like and particle-like behavior.

3) Electrons are in stationary orbits that do not radiate, which contradicts the predictions of classical
electromagnetism.

4) The orbits are quantized such that the electron angular momentum is an integer multiple of $\frac{h}{2\pi} = \hbar$.

5) Atomic electromagnetic radiation is emitted with photon energy equal to the difference in binding
energy between the two atomic levels involved, $h\nu = E_1 - E_2$.

The first two assumptions are due to Planck and Einstein, while the last three were made by Niels Bohr. 
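
The consequences of assumptions 3 to 5 for hydrogen can be illustrated numerically. The sketch below uses the standard Bohr-model result $E_n = -\frac{1}{2}\alpha^2 m_e c^2 / n^2$, which follows from combining the quantized angular momentum with circular Coulomb orbits but is not derived in the text above; it also assumes an infinitely heavy nucleus. The function names are illustrative.

```python
ALPHA = 7.2973525693e-3              # fine structure constant
ELECTRON_REST_ENERGY_EV = 510_998.95  # m_e c^2 in eV
HC_EV_NM = 1239.841984               # h c in eV nm


def bohr_energy_ev(n):
    """Energy in eV of the n-th Bohr orbit of hydrogen (infinite nuclear mass assumed)."""
    return -0.5 * ALPHA**2 * ELECTRON_REST_ENERGY_EV / n**2


def transition_wavelength_nm(n_upper, n_lower):
    """Wavelength in nm of the photon emitted in the transition n_upper -> n_lower."""
    photon_energy_ev = bohr_energy_ev(n_upper) - bohr_energy_ev(n_lower)
    return HC_EV_NM / photon_energy_ev


if __name__ == "__main__":
    print(f"ground state: {bohr_energy_ev(1):.3f} eV")
    print(f"n = 3 -> 2 line: {transition_wavelength_nm(3, 2):.1f} nm")
```

The ground-state energy is close to the measured 13.6 eV ionization energy of hydrogen, and the $n = 3 \to 2$ transition falls at the red Balmer line near 656 nm, which is the kind of agreement that made the Bohr model so persuasive despite its inconsistencies.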

The deficiencies of the Bohr model were the philosophical problems of violating the tenets of classical 
physics in explaining hydrogen-like atoms, that is, the theory was prescriptive, not deductive. The Bohr 
model was based implicitly on the assumption that quantum theory contains classical mechanics as a limiting 
case. Bohr explicitly stated this assumption which he called the correspondence principle, and which 
played a pivotal role in the development of the older quantum theory. In 1924 Bohr justified the inconsis- 
tencies of the old quantum theory by writing “As frequently emphasized, these principles, although they 
are formulated by the help of classical conceptions, are to be regarded purely as laws of quantum theory, 
which give us, not withstanding the formal nature of quantum theory, a hope in the future of a consistent 
theory, which at the same time reproduces the characteristic features of quantum theory, important for its 
applicability, and, nevertheless, can be regarded as a rational generalization of classical electrodynamics.” 

The old quantum theory was remarkably successful in reproducing the black-body spectrum, specific heats 
of solids, the hydrogen atom, and the periodic table of the elements. Unfortunately, from a methodological 
point of view, the theory was a hodgepodge of hypotheses, principles, theorems, and computational recipes, 
rather than a logically consistent theory. Every problem was first solved in terms of classical mechanics,
and then would pass through a mysterious quantization procedure involving the correspondence principle. 
Although built on the foundation of classical mechanics, it required Bohr’s hypotheses which violated the 
laws of classical mechanics and predictions of Maxwell’s equations. 


18.2.2 Quantization 


By 1912 Planck, and others, had abandoned the concept that quantum theory was a branch of classical 
mechanics, and were searching to see if classical mechanics was a special case of a more general quantum 
physics, or quantum physics was a science altogether outside of classical mechanics. Also they were trying 
to find a consistent and rational reason for quantization to replace the ad hoc assumption of Bohr. 
In 1912 Sommerfeld proposed that, in every elementary process, the atom gains or loses a definite amount
of action between times $t_0$ and $t$ of

$$S = \int_{t_0}^{t} L(t')\,dt' \qquad (18.3)$$

where S is the quantal analogue of the classical action function. It has been shown that the classical principle 
of least action states that the action function is stationary for small variations of the trajectory. In 1915 
Wilson and Sommerfeld recognized that the quantization of angular momentum could be expressed in terms 
of the action-angle integral, that is equation 15.116. They postulated that, for every coordinate, the action- 
angle variable is quantized 


$$\oint p_k\,dq_k = nh \qquad (18.4)$$


where the action-angle variable integral is over one complete period of the motion. That is, they postulated 
that Hamilton’s phase space is quantized, but the microscopic granularity is such that the quantization is 
only manifest for atomic-sized domains. That is, n is a small integer for atomic systems in contrast to 
$n \approx 10^{74}$ for the Earth-Sun two-body system.
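
The enormous size of the quantum number for a macroscopic system can be estimated directly. The sketch below applies the quantization condition to the angular coordinate of a circular orbit, for which $\oint p_\theta\,d\theta = 2\pi p_\theta = nh$, so that $n = p_\theta/\hbar$. The Earth mass, orbital radius, and orbital period are rounded values assumed for illustration, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
M_EARTH = 5.972e24        # mass of the Earth, kg
ORBIT_RADIUS = 1.496e11   # mean Earth-Sun distance, m
ORBIT_PERIOD = 3.156e7    # one year, s


def orbital_quantum_number(mass, radius, period):
    """n = p_theta / hbar for a circular orbit, from the action-angle condition."""
    speed = 2.0 * math.pi * radius / period
    angular_momentum = mass * radius * speed
    return angular_momentum / HBAR


if __name__ == "__main__":
    n = orbital_quantum_number(M_EARTH, ORBIT_RADIUS, ORBIT_PERIOD)
    print(f"n is of order 10^{math.floor(math.log10(n))}")
```

With $n$ this large, the spacing between adjacent quantized orbits is immeasurably small, which is why the granularity of phase space is invisible in planetary motion.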

Sommerfeld recognized that quantization of more than one degree of freedom is needed to obtain a more
accurate description of the hydrogen atom. Sommerfeld reproduced the experimental data by assuming
quantization of the three degrees of freedom, 


\oint p_r\,dr = n_1 h \qquad \oint p_\theta\,d\theta = n_2 h \qquad \oint p_\phi\,d\phi = n_3 h    (18.5)


and solving Hamilton-Jacobi theory by separation of variables. In 1916 the Bohr-Sommerfeld model solved 
the classical orbits for the hydrogen atom, including relativistic corrections as described in example 17.7. 
This reproduced fine structure observed in the optical spectra of hydrogen. The use of the canonical trans- 
formation to action-angle variables proved to be the ideal approach for solving many such problems in 
quantum mechanics. In 1921, Stern and Gerlach demonstrated space quantization by observing the splitting 
of atomic beams deflected by non-uniform magnetic fields. This result was a major triumph for quantum 
theory. Sommerfeld declared that “With their bold experimental method, Stern and Gerlach demonstrated 
not only the existence of space quantization, they also proved the atomic nature of the magnetic moment, 
its quantum-theoretic origin, and its relation to the atomic structure of electricity.” 

In 1925, Pauli’s Exclusion Principle proposed that no more than one electron can have identical quantum 
numbers and that the atomic electronic state is specified by four quantum numbers. Two students, Goudsmit
and Uhlenbeck, suggested that a fourth two-valued quantum number was the electron spin of ±1/2. This
provided a plausible explanation for the structure of multi-electron atoms. 


18.2.3 Wave-particle duality 


In his 1924 doctoral thesis, Prince Louis de Broglie proposed the hypothesis of wave-particle duality which 
was a pivotal development in quantum theory. de Broglie used the classical concept of a matter wavepacket, 
analogous to classical wave packets discussed in chapter 3.11. He assumed that both the group and signal 
velocities of a matter wave packet must equal the velocity of the corresponding particle. By analogy with 
Einstein's relation for the photon, and using the Theory of Special Relativity, de Broglie assumed that 

\hbar\omega = E = \frac{mc^2}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}    (18.6)

The group velocity is required to equal the velocity v of the mass m

v_{group} = \frac{d\omega}{dk} = \frac{d\omega/dv}{dk/dv} = v    (18.7)

This gives

\frac{dk}{dv} = \frac{1}{v}\frac{d\omega}{dv} = \frac{m}{\hbar}\left(1 - \left(\frac{v}{c}\right)^2\right)^{-3/2}    (18.8)

Integration of this equation, assuming that k = 0 when v = 0, then gives

\hbar k = \frac{mv}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} = p    (18.9)


This relation, derived by de Broglie, is required to ensure that the particle travels at the group velocity 
of the wave packet characterizing the particle. Note that although the relations used to characterize the 
matter waves are purely classical, the physical content of such waves is beyond classical physics. In 1927 C. 
Davisson and G.P. Thomson independently observed electron diffraction confirming wave/particle duality for 
the electron. Ironically, J.J. Thomson discovered that the electron is a particle, whereas his son,
G.P. Thomson, showed that it behaves as a wave.
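
The statement that the particle travels at the group velocity of its matter wave can be checked numerically. The sketch below takes ħω = E and ħk = p with the relativistic energy and momentum, parametrizes both by the particle velocity v, and estimates dω/dk by central differences. The choice of the electron mass, the step size, and the function names are assumptions made for this illustration.

```python
import math

C = 299_792_458.0        # speed of light, m/s
HBAR = 1.054_571_8e-34   # reduced Planck constant, J s
M_E = 9.109_383_7e-31    # electron rest mass, kg (example particle)


def lorentz_gamma(v):
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def omega(v, m=M_E):
    """Angular frequency from hbar*omega = E = gamma*m*c^2."""
    return lorentz_gamma(v) * m * C**2 / HBAR


def wavenumber(v, m=M_E):
    """Wavenumber from hbar*k = p = gamma*m*v."""
    return lorentz_gamma(v) * m * v / HBAR


def group_velocity(v, rel_step=1e-4):
    """Central-difference estimate of d(omega)/dk, with both parametrized by v."""
    dv = rel_step * v
    d_omega = omega(v + dv) - omega(v - dv)
    d_k = wavenumber(v + dv) - wavenumber(v - dv)
    return d_omega / d_k


for beta in (0.01, 0.5, 0.9):
    v = beta * C
    print(f"v/c = {beta:4.2f}   v_group/v = {group_velocity(v) / v:.6f}")
```

The ratio v_group/v stays at unity, to within the finite-difference error, from the non-relativistic to the relativistic regime, whereas the phase velocity ω/k = c²/v exceeds c.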

Heisenberg developed the modern matrix formulation of quantum theory in 1925; he was 24 years old
at the time. A few months later Schrödinger developed wave mechanics based on de Broglie's concept of
wave-particle duality. The matrix-mechanics and wave-mechanics formulations of quantum theory are radically
different. Heisenberg's algebraic approach employs non-commuting quantities and unfamiliar mathematical
techniques that emphasize the discreteness characteristic of the corpuscle aspect. In contrast, Schrödinger
used the familiar analytical approach that is an extension of classical laws of motion and waves, which
stressed the element of continuity.


18.3 Hamiltonian in quantum theory


18.3.1 Heisenberg's matrix-mechanics representation 


The algebraic Heisenberg representation of quantum theory is analogous to the algebraic Hamiltonian rep- 
resentation of classical mechanics, and shows best how quantum theory evolved from, and is related to, 
classical mechanics. Heisenberg decided to ignore the prevailing conceptual theories, such as classical me- 
chanics, and based his quantum theory on observables. This approach was influenced by the success of 
Bohr’s older quantum theory and Einstein’s theory of relativity. He abandoned the classical notions that 
the canonical variables p_k, q_k can be measured directly and simultaneously. Secondly, he wished to absorb the
correspondence principle directly into the theory instead of it being an ad hoc procedure tailored to each
application. Heisenberg considered the Fourier decomposition of transition amplitudes between discrete states
and found that products of the conjugate variables do not commute. Heisenberg derived, for the first
time, the correct energy levels of the one-dimensional harmonic oscillator as E_n = ħω(n + 1/2), which was a
significant achievement. Born recognized that Heisenberg's strange multiplication and commutation rules for
two variables corresponded to matrix algebra. Prior to 1925, matrix algebra was an obscure branch of pure
mathematics not known or used by the physics community. Heisenberg, Born, and the young mathematician
Jordan developed the commutation rules of matrix mechanics. Heisenberg's approach represents the
classical position and momentum coordinates q, p by matrices q and p, with corresponding matrix elements
q_{mn} e^{i\omega_{mn} t} and p_{mn} e^{i\omega_{mn} t}. Born showed that the trace of the matrix

H(p, q) = p\dot{q} - L    (18.10)

gives the Hamiltonian function H(p, q) of the matrices q and p, which leads to Hamilton's canonical equations

\dot{q} = \frac{\partial H}{\partial p} \qquad \dot{p} = -\frac{\partial H}{\partial q}    (18.11)

Heisenberg and Born also showed that the commutator of q, p equals


q_k p_l - p_l q_k = i\hbar \delta_{kl}    (18.12)
q_k q_l - q_l q_k = 0
p_k p_l - p_l p_k = 0


Born realized that equation (18.12) is the only fundamental equation for introducing ħ into the theory in a
logical and consistent way.
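
The commutation rule 18.12 and the oscillator energy levels quoted above can be illustrated with explicit matrices. The sketch below builds the position and momentum matrices of the one-dimensional harmonic oscillator in the energy basis, truncated to a finite size, and evaluates qp − pq and the diagonal of H = p²/2m + mω²q²/2. The natural units (ħ = m = ω = 1), the truncation size, and all function names are assumptions made for this illustration; the true matrices are infinite, so the last row and column of the truncated products are not meaningful.

```python
import math

HBAR, MASS, OMEGA = 1.0, 1.0, 1.0  # natural units, assumed for illustration
SIZE = 6                            # truncation size, also an assumption


def oscillator_matrices(size):
    """Truncated q and p matrices of the harmonic oscillator in the energy basis."""
    q_scale = math.sqrt(HBAR / (2.0 * MASS * OMEGA))
    p_scale = math.sqrt(HBAR * MASS * OMEGA / 2.0)
    q = [[0j] * size for _ in range(size)]
    p = [[0j] * size for _ in range(size)]
    for n in range(1, size):
        root = math.sqrt(n)
        # Only neighbouring levels n-1 and n are connected.
        q[n - 1][n] = q[n][n - 1] = q_scale * root
        p[n - 1][n] = -1j * p_scale * root
        p[n][n - 1] = 1j * p_scale * root
    return q, p


def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


q, p = oscillator_matrices(SIZE)
qp, pq = matmul(q, p), matmul(p, q)
qp_commutator = [[qp[i][j] - pq[i][j] for j in range(SIZE)] for i in range(SIZE)]

# Hamiltonian matrix H = p^2/2m + m*omega^2*q^2/2
p2, q2 = matmul(p, p), matmul(q, q)
hamiltonian = [[p2[i][j] / (2 * MASS) + 0.5 * MASS * OMEGA**2 * q2[i][j]
                for j in range(SIZE)] for i in range(SIZE)]

for n in range(SIZE - 1):  # the last row and column are spoiled by the truncation
    print(n, qp_commutator[n][n], hamiltonian[n][n].real)
```

Away from the truncation edge, the diagonal elements of qp − pq equal iħ and those of H equal ħω(n + 1/2).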

Chapter 15.2.4 discussed the formal correspondence between the Poisson bracket, defined in chapter 15.3,
and the commutator in classical mechanics. It was shown that the commutator of two functions equals a
constant multiplicative factor λ times the corresponding Poisson Bracket. That is


(F_j G_k - G_k F_j) = \lambda [F_j, G_k]    (18.13)


where the multiplicative factor λ is a number independent of F_j, G_k, and the commutator.

In 1925, Paul Dirac, a 23-year-old graduate student at Cambridge, recognized the crucial importance of
the above correspondence between the commutator and the Poisson Bracket of two functions for relating
classical mechanics and quantum mechanics. Dirac noted that if the constant λ is assigned the value λ = iħ,
then equation 18.13 directly relates Heisenberg's commutation relations between the fundamental canonical
variables (q_k, p_l) to the corresponding classical Poisson Bracket [q_k, p_l]. That is,


q_k p_l - p_l q_k = i\hbar [q_k, p_l] = i\hbar \delta_{kl}    (18.14)
q_k q_l - q_l q_k = i\hbar [q_k, q_l] = 0    (18.15)
p_k p_l - p_l p_k = i\hbar [p_k, p_l] = 0    (18.16)
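
The classical side of this correspondence can be made concrete for a single degree of freedom. The sketch below estimates the Poisson bracket [F, G] = ∂F/∂q ∂G/∂p − ∂F/∂p ∂G/∂q by central differences and evaluates it for the canonical pair and for an oscillator Hamiltonian. The oscillator parameters, the phase-space point, and the function names are hypothetical choices made for this illustration.

```python
MASS, SPRING = 2.0, 3.0  # hypothetical oscillator parameters


def poisson_bracket(f, g, q, p, step=1e-5):
    """Central-difference estimate of [f, g] = df/dq*dg/dp - df/dp*dg/dq at (q, p)."""
    df_dq = (f(q + step, p) - f(q - step, p)) / (2 * step)
    df_dp = (f(q, p + step) - f(q, p - step)) / (2 * step)
    dg_dq = (g(q + step, p) - g(q - step, p)) / (2 * step)
    dg_dp = (g(q, p + step) - g(q, p - step)) / (2 * step)
    return df_dq * dg_dp - df_dp * dg_dq


def coordinate(q, p):
    return q


def momentum(q, p):
    return p


def hamiltonian(q, p):
    return p**2 / (2 * MASS) + 0.5 * SPRING * q**2


q0, p0 = 0.7, -1.3  # arbitrary phase-space point
print("[q, p] =", poisson_bracket(coordinate, momentum, q0, p0))
print("[q, H] =", poisson_bracket(coordinate, hamiltonian, q0, p0), " p/m  =", p0 / MASS)
print("[p, H] =", poisson_bracket(momentum, hamiltonian, q0, p0), " -k q =", -SPRING * q0)
```

The brackets [q, p] = 1, [q, H] = p/m, and [p, H] = −kq are the classical quantities that Dirac's prescription maps onto commutators divided by iħ.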


Dirac recognized that the correspondence between the classical Poisson bracket and the quantum commutator,
given by equation (18.13), provides a logical and consistent way that builds quantization directly into
the theory, rather than using an ad hoc, case-dependent hypothesis as used by the older quantum theory of
Bohr. The basis of Dirac's quantization principle involves replacing the classical Poisson Bracket [F_j, G_k]
by the commutator (1/iħ)(F_j G_k − G_k F_j). That is,


[F_j, G_k] \Longrightarrow \frac{1}{i\hbar}(F_j G_k - G_k F_j)    (18.17)


Hamilton's canonical equations, as introduced in chapter 15, are only applicable to classical mechanics
since they assume that the position and conjugate momentum can be specified both exactly and
simultaneously, which contradicts Heisenberg's Uncertainty Principle. In contrast, the Poisson bracket
generalization of Hamilton’s equations allows for non-commuting variables plus the corresponding uncertainty 
principle. That is, the transformation from classical mechanics to quantum mechanics can be accomplished 
simply by replacing the classical Poisson Bracket by the quantum commutator, as proposed by Dirac. The 
formal analogy between classical Hamiltonian mechanics, and the Heisenberg representation of quantum 
mechanics is strikingly apparent using the correspondence between the Poisson Bracket representation of 
Hamiltonian mechanics and Heisenberg’s matrix mechanics. 

The direct relation between the quantum commutator, and the corresponding classical Poisson Bracket, 
applies to many observables. For example, the quantum analogs of Hamilton’s equations of motion are 
given by use of Hamilton’s equations of motion, 15.53,15.56, and replacing each Poisson Bracket by the 
corresponding commutator. That is 


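\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k} = [q_k, H] = \frac{1}{i\hbar}(q_k H - H q_k)    (18.18)

\frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k} = [p_k, H] = \frac{1}{i\hbar}(p_k H - H p_k)    (18.19)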


Chapter 15.2.5 discussed the time dependence of observables in Hamiltonian mechanics. Equation 15.45 
gave the total time derivative of any observable G to be 


\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H]    (18.20)

Equation 18.17 can be used to replace the Poisson Bracket by the quantum commutator, which gives the 
corresponding time dependence of observables in quantum physics. 


\frac{dG}{dt} = \frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG)    (18.21)


In quantum mechanics, equation 18.21 is called the Heisenberg equation. Note that if the observable G is
chosen to be a fundamental canonical variable, then ∂q_k/∂t = 0 = ∂p_k/∂t and equation 18.21 reduces to
Hamilton's equations 18.18 and 18.19.
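
The reduction of the Heisenberg equation to Hamilton's equations can be verified with explicit matrices for the harmonic oscillator. The sketch below evaluates (1/iħ)(GH − HG) for G = q and G = p, using oscillator matrices in the energy basis and a diagonal Hamiltonian with E_n = ħω(n + 1/2), and compares the results with p/m and −mω²q. The natural units, the truncation to a 6 × 6 block, and the function names are assumptions made for this illustration.

```python
import math

HBAR, MASS, OMEGA = 1.0, 1.0, 1.0  # natural units, assumed for illustration
SIZE = 6                            # truncation size, also an assumption


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(SIZE)) for j in range(SIZE)]
            for i in range(SIZE)]


def heisenberg_rate(g, h):
    """(1/i*hbar)(GH - HG) for an observable with no explicit time dependence."""
    gh, hg = matmul(g, h), matmul(h, g)
    return [[(gh[i][j] - hg[i][j]) / (1j * HBAR) for j in range(SIZE)]
            for i in range(SIZE)]


# Oscillator position and momentum matrices in the energy basis
q = [[0j] * SIZE for _ in range(SIZE)]
p = [[0j] * SIZE for _ in range(SIZE)]
for n in range(1, SIZE):
    q[n - 1][n] = q[n][n - 1] = math.sqrt(n * HBAR / (2 * MASS * OMEGA))
    p[n][n - 1] = 1j * math.sqrt(n * HBAR * MASS * OMEGA / 2)
    p[n - 1][n] = -p[n][n - 1]

# Diagonal Hamiltonian with E_n = hbar*omega*(n + 1/2)
hamiltonian = [[HBAR * OMEGA * (n + 0.5) if n == m else 0j for m in range(SIZE)]
               for n in range(SIZE)]

q_dot = heisenberg_rate(q, hamiltonian)
p_dot = heisenberg_rate(p, hamiltonian)

# Largest deviation from Hamilton's equations dq/dt = p/m and dp/dt = -m*omega^2*q
max_q_error = max(abs(q_dot[i][j] - p[i][j] / MASS)
                  for i in range(SIZE) for j in range(SIZE))
max_p_error = max(abs(p_dot[i][j] + MASS * OMEGA**2 * q[i][j])
                  for i in range(SIZE) for j in range(SIZE))
print(max_q_error, max_p_error)
```

Both deviations vanish to rounding error, so the matrix equations of motion have the same form as the classical canonical equations for this system.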

The analogies between classical mechanics and quantum mechanics extend further. For example, if G is
a constant of motion, that is, dG/dt = 0, then Heisenberg's equation of motion gives

\frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) = 0    (18.22)

Moreover, if G is not an explicit function of time, then

0 = \frac{1}{i\hbar}(GH - HG)    (18.23)

That is, the transition to quantum physics shows that, if G is a constant of motion, and is not explicitly 
time dependent, then G commutes with the Hamiltonian H. 

The above discussion has illustrated the close and beautiful correspondence between the Poisson Bracket 
representation of classical Hamiltonian mechanics, and the Heisenberg representation of quantum mechanics. 
Dirac provided the elegant and simple correspondence principle connecting the Poisson bracket representation 
of classical Hamiltonian mechanics, to the Heisenberg representation of quantum mechanics. 


18.3.2 Schrödinger's wave-mechanics representation


The wave mechanics formulation of quantum mechanics, by the Austrian theorist Schrödinger, was built on
the wave-particle duality concept that was proposed in 1924 by Louis de Broglie. Schrödinger developed
his wave mechanics representation of quantum physics a year after the development of matrix mechanics
by Heisenberg and Born. The Schrödinger wave equation is based on the non-relativistic Hamilton-Jacobi
representation of a wave equation, melded with the operator formalism of Born and Wiener. The 39-year-old
Schrödinger was an expert in classical mechanics and wave theory, which was invaluable when he developed
the important Schrödinger equation. As mentioned in chapter 15.4.4, the Hamilton-Jacobi theory is a
formalism of classical mechanics that allows the motion of a particle to be represented by a wave. That is,
the wavefronts are surfaces of constant action S, and the particle momenta are normal to these
constant-action surfaces, that is, p = ∇S. The wave-particle duality of Hamilton-Jacobi theory is a natural way to
handle the wave-particle duality proposed by de Broglie.
Consider the classical Hamilton-Jacobi equation for one body, given by 18.20. 


\frac{\partial S}{\partial t} + H(q, \nabla S, t) = 0    (18.24)


If the Hamiltonian is time independent, then equation 15.90 gives that 


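\frac{\partial S}{\partial t} = -H(q, p, t) = -E(\alpha)    (18.25)

The integration of the time dependence is trivial, and thus the action integral for a time-independent
Hamiltonian is

S(q, \alpha, t) = W(q, \alpha) - E(\alpha)\,t    (18.26)

A formal transformation gives

E = -\frac{\partial S}{\partial t} \qquad p = \nabla S    (18.27)

Consider that the classical time-independent Hamiltonian, for motion of a single particle, is represented
by the Hamilton-Jacobi equation.

H = \frac{p^2}{2\mu} + U(q) = -\frac{\partial S}{\partial t}    (18.28)

Substituting for p leads to the classical Hamilton-Jacobi relation in terms of the action S

\frac{1}{2\mu}(\nabla S \cdot \nabla S) + U(q) = -\frac{\partial S}{\partial t}    (18.29)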


By analogy with the Hamilton-Jacobi equation, Schrödinger proposed the quantum operator equation

i\hbar\frac{\partial \psi}{\partial t} = \hat{H}\psi    (18.30)

where \hat{H} is an operator given by

\hat{H} = -\frac{\hbar^2}{2\mu}\nabla^2 + U(r)    (18.31)


In 1926, Max Born and Norbert Wiener introduced the operator formalism into matrix mechanics for prediction
of observables, and this has become an integral part of quantum theory. In the operator formalism, the
observables are represented by operators that project the corresponding observable from the wavefunction.
That is, the assumed momentum and energy operators, that operate on the wavefunction ψ, are

p_j = \frac{\hbar}{i}\frac{\partial}{\partial q_j} \qquad E = -\frac{\hbar}{i}\frac{\partial}{\partial t}    (18.32)

Formal transformation of p in the Hamiltonian (18.28) leads to the time-independent Schrödinger
equation

-\frac{\hbar^2}{2\mu}\nabla^2\psi + U(q)\psi = E\psi    (18.33)

Assume that the wavefunction is of the form

\psi = A e^{iS/\hbar}    (18.34)


where the action S gives the phase of the wavefront, and A the amplitude of the wave, as described in 
chapter 15.4.4. The time dependence, that characterizes the motion of the wavefront, is contained in the 
time dependence of S. This form for the wavefunction has the advantage that the wavefunction frequently 
factors into a product of terms, e.g. ψ = R(r)Θ(θ)Φ(φ), which corresponds to a summation of the exponents
S = W_r + W_θ + W_φ − Et. This summation form is exploited by separation of the variables, as discussed in
chapter 15.4.3. 

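Inserting ψ (18.34) into equation (18.30), plus using the fact that

\frac{\partial^2 \psi}{\partial q^2} = \frac{\partial}{\partial q}\left(\frac{i}{\hbar}\frac{\partial S}{\partial q}\psi\right) = -\frac{1}{\hbar^2}\left(\frac{\partial S}{\partial q}\right)^2\psi + \frac{i}{\hbar}\frac{\partial^2 S}{\partial q^2}\psi    (18.35)

leads to

-\frac{\partial S}{\partial t} = \frac{1}{2\mu}(\nabla S \cdot \nabla S) + U(q) - \frac{i\hbar}{2\mu}\nabla^2 S    (18.36)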


Note that if Planck's constant ħ = 0, then the imaginary term in equation (18.36) is zero, leading to 18.36
being real, and identical to the Hamilton-Jacobi result, equation 18.29. The fact that equation 18.36
equals the Hamilton-Jacobi equation in the limit ħ → 0 illustrates the close analogy between the
wave-particle duality of the classical Hamilton-Jacobi theory and de Broglie's wave-particle duality in
Schrödinger's quantum wave-mechanics representation.
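
A simple consistency check of this correspondence is the free particle in one dimension, for which the classical action S = pq − Et with E = p²/2μ satisfies the Hamilton-Jacobi equation with U = 0. The sketch below verifies by finite differences that the wavefunction ψ = exp(iS/ħ) of the form 18.34 then satisfies the operator equation 18.30. The numerical values of ħ, μ, and p, the step sizes, and the function names are hypothetical choices made for this illustration.

```python
import cmath

HBAR = 1.0   # natural units, assumed for illustration
MU = 1.0     # particle mass (hypothetical value)
P0 = 2.0     # momentum of the free particle (hypothetical value)
E0 = P0**2 / (2 * MU)


def action(q, t):
    """Free-particle action S = p*q - E*t."""
    return P0 * q - E0 * t


def psi(q, t):
    """Wavefunction A*exp(i*S/hbar) with unit amplitude A."""
    return cmath.exp(1j * action(q, t) / HBAR)


def d_dt(f, q, t, h=1e-4):
    return (f(q, t + h) - f(q, t - h)) / (2 * h)


def d_dq(f, q, t, h=1e-4):
    return (f(q + h, t) - f(q - h, t)) / (2 * h)


def d2_dq2(f, q, t, h=1e-4):
    return (f(q + h, t) - 2 * f(q, t) + f(q - h, t)) / h**2


q0, t0 = 0.3, 0.7  # arbitrary point in space and time

# Classical Hamilton-Jacobi residual: dS/dt + (dS/dq)^2 / (2*mu)
hj_residual = d_dt(action, q0, t0) + d_dq(action, q0, t0) ** 2 / (2 * MU)

# Schrodinger residual: i*hbar*dpsi/dt + (hbar^2 / 2*mu) * d2psi/dq2
schrodinger_residual = (1j * HBAR * d_dt(psi, q0, t0)
                        + (HBAR**2 / (2 * MU)) * d2_dq2(psi, q0, t0))

print(abs(hj_residual), abs(schrodinger_residual))
```

For this plane wave the second derivative of S vanishes, so the ħ-dependent term of equation 18.36 is identically zero and the classical and quantal equations coincide for any value of ħ. In general the extra term is nonzero and disappears only in the limit ħ → 0.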

The Schrödinger approach was accepted in 1926 and exploited extensively with tremendous success, since
it is much easier to grasp conceptually than is the algebraic approach of Heisenberg. Initially there was much
conflict between the proponents of these two contradictory approaches, but this was resolved by Schrödinger
who showed in 1926 that there is a formal mathematical identity between wave mechanics and matrix
mechanics. That is, these two quantal representations of Hamiltonian mechanics are equivalent, even though
they are built on either the Poisson bracket representation, or the Hamilton-Jacobi representation. Wave
mechanics is based intimately on the quantization rule of the action variable. Heisenberg's Uncertainty
Principle is automatically satisfied by Schrödinger's wave mechanics since the uncertainty principle is a
feature of all wave motion, as described in chapter 3.

In 1928 Dirac developed a relativistic wave equation which includes spin as an integral part. This Dirac 
equation remains the fundamental wave equation of quantum mechanics. Unfortunately it is difficult to 
apply. 

Today the powerful and efficient Heisenberg representation is the dominant approach used in the field of 
physics, whereas chemists tend to prefer the more intuitive Schrödinger wave mechanics approach. In either
case, the important role of Hamiltonian mechanics in quantum theory is undeniable. 


18.4 Lagrangian representation in quantum theory 


The classical notion of canonical coordinates and momenta, has a simple quantum analog which has al- 
lowed the Hamiltonian theory of classical mechanics, that is based on canonical coordinates, to serve as the 
foundation for the development of quantum mechanics. The alternative Lagrangian formulation for classical 
dynamics is described in terms of coordinates and velocities, instead of coordinates and momenta. The La- 
grangian and Hamiltonian formulations are closely related, and it may appear that the Lagrangian approach 
is more fundamental. The Lagrangian method allows collecting together all the equations of motion and 
expressing them as stationary properties of the action integral, and thus it may appear desirable to base 
quantum mechanics on the Lagrangian theory of classical mechanics. Unfortunately, the Lagrangian equa- 
tions of motion involve partial derivatives with respect to coordinates, and their velocities, and the meaning 
ascribed to such derivatives is difficult in quantum mechanics. The close correspondence between Poisson 
brackets and the commutation rules leads naturally to Hamiltonian mechanics. However, Dirac showed that 
Lagrangian mechanics can be carried over to quantum mechanics using canonical transformations such that 
the classical Lagrangian is considered to be a function of coordinates at time t and t + dt rather than of 
coordinates and velocities. 

The motivation for Feynman’s 1942 Ph.D. thesis, entitled “The Principle of Least Action in Quantum
Mechanics”, was to quantize the classical action at a distance in electrodynamics. This theory adopted an 
overall space-time viewpoint for which the classical Hamiltonian approach, as used in conventional formu- 
lations of quantum mechanics, is inapplicable. Feynman used the Lagrangian, plus the principle of least 
action, to underlie his development of quantum field theory. To paraphrase Feynman’s Nobel Lecture, he 
used a physical approach that is quite different from the customary Hamiltonian point of view for which the 
system is discussed in great detail as a function of time. That is, you have the field at this moment, then a 
differential equation gives you the field at a later moment and so on; that is, the Hamiltonian approach is a 
time differential method. In Feynman’s least-action approach the action describes the character of the path 
throughout all of space and time. The behavior of nature is determined by saying that the whole space-time 
path has a certain character. The use of action involves both advanced and retarded terms that make it 
difficult to transform back to the Hamiltonian form. The Feynman space-time approach is far beyond the 
scope of this course. This topic will be developed in advanced graduate courses on quantum field theory. 


18.5 Correspondence Principle 


The Correspondence Principle implies that any new theory in physics must reduce to preceding theories 
that have been proven to be valid. For example, Einstein’s Special Theory of Relativity satisfies the Corre- 
spondence Principle since it reduces to classical mechanics for velocities small compared with the velocity 
of light. Similarly, the General Theory of Relativity reduces to Newton’s Law of Gravitation in the limit 
of weak gravitational fields. Bohr’s Correspondence Principle requires that the predictions of quantum me- 
chanics must reproduce the predictions of classical physics in the limit of large quantum numbers. Bohr’s 
Correspondence Principle played a pivotal role in the development of the old quantum theory, from its
inception in 1912, until 1925 when the old quantum theory was superseded by the current matrix and wave 
mechanics representations of quantum mechanics. 
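
Bohr's Correspondence Principle can be illustrated with the hydrogen levels E_n = −R/n² of the old quantum theory. The sketch below compares the frequency of the radiation emitted in the transition n+1 → n with the classical orbital frequency dE/dJ of the orbit having action J = nh. The rounded constants and the function names are assumptions made for this illustration.

```python
RYDBERG_EV = 13.605693    # hydrogen ground-state binding energy in eV (rounded)
H_EV_S = 4.135667696e-15  # Planck constant in eV s


def level_energy(n):
    """Bohr energy level E_n = -R / n^2 in eV."""
    return -RYDBERG_EV / n**2


def quantum_frequency(n):
    """Frequency of the radiation emitted in the transition n+1 -> n."""
    return (level_energy(n + 1) - level_energy(n)) / H_EV_S


def classical_orbit_frequency(n):
    """Classical orbital frequency dE/dJ = (1/h) dE/dn evaluated at J = n*h."""
    return 2.0 * RYDBERG_EV / (H_EV_S * n**3)


def frequency_ratio(n):
    return quantum_frequency(n) / classical_orbit_frequency(n)


for n in (1, 10, 100, 1000):
    print(f"n = {n:5d}   quantum/classical frequency = {frequency_ratio(n):.5f}")
```

The ratio equals n(2n+1)/(2(n+1)²), which is 0.375 for n = 1 and approaches unity for large n, where the quantal prediction merges with the classical one.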

Quantum theory now is a well-established field of physics that is equally as fundamental as is classical 
mechanics. The Correspondence Principle now is used to project out the analogous classical-mechanics 
phenomena that underlie the observed properties of quantal systems. For example, this book has studied 
the classical-mechanics analogs of the observed behavior for typical quantal systems, such as the vibrational 
and rotational modes of the molecule, and the vibrational modes of the crystalline lattice. The nucleus is the 
epitome of a many-body, strongly-interacting, quantal system. Example 14.12 showed that there is a close 
correspondence between classical-mechanics predictions, and quantal predictions, for both the rotational and 
vibrational collective modes of the nucleus, as well as for the single-particle motion of the nucleons in the 
nuclear mean field, such as the onset of Coriolis-induced alignment. This use of the Correspondence Principle 
can provide considerable insight into the underlying classical physics embedded in quantal systems. 


18.6 Summary


The important point of this discussion is that variational formulations of classical mechanics provide a 
rational, and direct basis, for the development of quantum mechanics. It has been shown that the final form 
of quantum mechanics is closely related to the Hamiltonian formulation of classical mechanics. Quantum 
mechanics supersedes classical mechanics as the fundamental theory of mechanics in that classical mechanics 
only applies for situations where quantization is unimportant, and is the limiting case of quantum mechanics 
when ħ → 0, which is in agreement with Bohr’s Correspondence Principle. The Dirac relativistic theory
of quantum mechanics is the ultimate quantal theory for the relativistic regime. 

This discussion has barely scratched the surface of the correspondence between classical and quantal 
mechanics, which goes far beyond the scope of this course. The goal of this chapter is to illustrate that 
classical mechanics, in particular, Hamiltonian mechanics, underlies much of what you will learn in your 
quantum physics courses. An interesting similarity between quantum mechanics and classical mechanics is 
that physicists usually use the more visual Schrödinger wave representation in order to describe quantum
physics to the non-expert, which is analogous to the similar use of Newtonian physics in classical mechan- 
ics. However, practicing physicists invariably use the more abstract Heisenberg matrix mechanics to solve 
problems in quantum mechanics, analogous to widespread use of the variational approach in classical me- 
chanics, because the analytical approaches are more powerful and have fundamental advantages. Quantal 
problems in molecular, atomic, nuclear, and subnuclear systems, usually involve finding the normal modes 
of a quantal system, that is, finding the eigen-energies, eigen-functions, spin, parity, and other observables 
for the discrete quantized levels. Solving the equations of motion for the modes of quantal systems is sim- 
ilar to solving the many-body coupled-oscillator problem in classical mechanics, where it was shown that 
use of matrix mechanics is the most powerful representation. It is ironic that the introduction of matrix 
methods to classical mechanics is a by-product of the development of matrix mechanics by Heisenberg, Born 
and Jordan. This illustrates that classical mechanics not only played a pivotal role in the development of 
quantum mechanics, but it also has benefitted considerably from the development of quantum mechanics; 
that is, the synergistic relation between these two complementary branches of physics has been beneficial to 
both classical and quantum mechanics. 


Recommended reading 
“Quantum Mechanics” by P.A.M. Dirac, Oxford Press, 1947.
“Conceptual Development of Quantum Mechanics” by Max Jammer, McGraw-Hill, 1966.


Chapter 19 


Epilogue 


[Figure 19.1 diagram. Box labels: Hamilton's action principle; Hamiltonian; Lagrangian; d'Alembert's Principle; Solution for motion; Initial conditions.]


Figure 19.1: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton's
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton's Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d’Alembert’s
Principle. Newtonian mechanics can be derived directly based on Newton's Laws of Motion.


This book has introduced powerful analytical methods in physics that are based on applications of 
variational principles to Hamilton's Action Principle. These methods were pioneered in classical mechanics 
by Leibniz, Lagrange, Euler, Hamilton, and Jacobi, during the remarkable Age of Enlightenment, and reached 
full fruition at the start of the 20th century.

The philosophical roadmap, shown above, illustrates the hierarchy of philosophical approaches available 
when using Hamilton's Action Principle to derive the equations of motion of a system. The primary Stage 1
uses Hamilton's Action functional, S = \int_{t_i}^{t_f} L(q, \dot{q}, t)\,dt, to derive the Lagrangian and
Hamiltonian functionals. Stage 1 provides the most fundamental and sophisticated level of understanding and
involves specifying all the active degrees of freedom, as well as the interactions involved. Stage 2 uses the
Lagrangian or Hamiltonian functionals, derived at Stage 1, in order to derive the equations of motion for the
system of interest. Stage 3 then uses the derived equations of motion to solve for the motion of the system,
subject to a given set of initial conditions.

Newton postulated equations of motion for nonrelativistic classical mechanics that are identical to those 
derived by applying variational principles to Hamilton’s Principle. However, Newton’s Laws of Motion are 
applicable only to nonrelativistic classical mechanics, and cannot exploit the advantages of using the more
fundamental Hamilton’s Action Principle, Lagrangian, and Hamiltonian. Newtonian mechanics requires that 
all the active forces be included in the equations of motion, and involves dealing with vector quantities which 
is more difficult than using the scalar functionals, action, Lagrangian, or Hamiltonian. Lagrangian mechanics 
based on d’Alembert’s Principle does not exploit the advantages provided by Hamilton’s Action Principle. 

Considerable advantages result from deriving the equations of motion based on Hamilton’s Principle,
rather than basing them on Newton’s postulated Laws of Motion. It is significantly easier to use
variational principles to handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting
with Newton’s vector differential equations of motion. The three hierarchical stages of analytical mechanics
facilitate accommodating extra degrees of freedom, symmetries, constraints, and other interactions. For
example, the symmetries identified by Noether’s theorem are more easily recognized during the primary
“action” and secondary “Hamiltonian/Lagrangian” stages, rather than at the subsequent “equations-of-motion”
stage. Constraint forces, and approximations, introduced at Stage 1 or Stage 2, are easier to implement
than at the subsequent Stage 3. The correspondence of Hamilton’s Action in classical and quantal
mechanics, as well as relativistic invariance, are crucial advantages for using the analytical approach in
relativistic mechanics, fluid motion, quantum mechanics, and field theory.

Philosophically, Newtonian mechanics is straightforward to understand since it uses vector differential 
equations of motion that relate the instantaneous forces to the instantaneous accelerations. Moreover, 
the concepts of momentum plus force are intuitive to visualize, and both cause and effect are embedded 
in Newtonian mechanics. Unfortunately, Newtonian mechanics is incompatible with quantum physics, it 
violates the relativistic concepts of space-time, and fails to provide the unified description of the gravitational 
force plus planetary motion as geodesic motion in a four-dimensional Riemannian structure. 

The remarkable philosophical implications embedded in applying variational principles to Hamilton’s 
Principle, are based on the astonishing assumption that motion of a constrained system in nature follows 
a path that minimizes the action integral. As a consequence, solving the equations of motion is reduced 
to finding the optimum path that minimizes the action integral. The fact that nature follows optimization 
principles is nonintuitive, and was considered to be metaphysical by many scientists and philosophers during 
the 19th century, which delayed full acceptance of analytical mechanics until the development of the Theory
of Relativity and quantum mechanics. Variational formulations now have become the preeminent approach 
in modern physics and they have toppled Newtonian mechanics from the throne of classical mechanics that 
it occupied for two centuries. 
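
The idea that the actual path makes the action integral stationary can be seen in a small numerical experiment. The sketch below discretizes the action for a particle in a uniform gravitational field, L = mq̇²/2 − mgq, and compares the free-fall parabola between fixed end points with trial paths that differ from it by ε sin(πt/T). The mass, flight time, number of time steps, and the form of the perturbation are hypothetical choices made for this illustration.

```python
import math

MASS, GRAVITY = 1.0, 9.81    # hypothetical mass (kg) and gravitational acceleration (m/s^2)
T_TOTAL, STEPS = 1.0, 200    # flight time (s) and number of time steps (assumed)


def discrete_action(path):
    """Sum of (T - U)*dt over the time steps, with L = m*v^2/2 - m*g*q."""
    dt = T_TOTAL / STEPS
    total = 0.0
    for q_now, q_next in zip(path, path[1:]):
        velocity = (q_next - q_now) / dt
        q_mid = 0.5 * (q_now + q_next)
        total += (0.5 * MASS * velocity**2 - MASS * GRAVITY * q_mid) * dt
    return total


def trial_path(epsilon):
    """Free-fall parabola from q(0) = 0 to q(T) = 0 plus epsilon*sin(pi*t/T)."""
    path = []
    for i in range(STEPS + 1):
        t = i * T_TOTAL / STEPS
        parabola = 0.5 * GRAVITY * t * (T_TOTAL - t)
        path.append(parabola + epsilon * math.sin(math.pi * t / T_TOTAL))
    return path


for epsilon in (-0.2, -0.1, 0.0, 0.1, 0.2):
    print(f"epsilon = {epsilon:+.1f}   S = {discrete_action(trial_path(epsilon)):.6f}")
```

The action is smallest for ε = 0 and grows quadratically with ε, since the first-order variation about the true path vanishes. For this Lagrangian the stationary point is a true minimum; in general Hamilton's Principle requires only that the action be stationary.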

The scope of this book extends beyond the typical classical mechanics textbook in order to illustrate 
how Lagrangian and Hamiltonian dynamics provides the foundation upon which modern physics is built. 
Knowledge of analytical mechanics is essential for the study of modern physics. The techniques and physics 
discussed in this book reappear in different guises in many fields, but the basic physics is unchanged illustrat- 
ing the intellectual beauty, the philosophical implications, and the unity of the field of physics. The breadth 
of physics addressed by variational principles in classical mechanics, and the underlying unity of the field, 
are epitomized by the wide range of dimensions, energies, and complexity involved. The dimensions range 
from as large as 10^{27} m, to quantal analogues of classical mechanics of systems spanning in size down to the
Planck length of 1.62 × 10^{-35} m. Individual particles have been detected with kinetic energies ranging from
zero to greater than 101% eV. The complexity of classical mechanics spans from one body to the statistical 
mechanics of many-body systems. As a consequence, analytical variational methods have become the pre- 
mier approach to describe systems from the very largest to the smallest, and from one-body to many-body 
dynamical systems. 

The goal of this book has been to illustrate the astonishing power of analytical variational methods for 
understanding the physics underlying classical mechanics, as well as extensions to modern physics. However, 
the present narrative remains unfinished in that fundamental philosophical and technical questions have 
not been addressed. For example, analytical mechanics is based on the validity of the assumed principle of 
economy. This book has not addressed the philosophical question, “is the principle of economy a fundamental 
law of nature, or is it a fortuitous consequence of the fundamental laws of nature?” 

In summary, Hamilton’s action principle, which is built into Lagrangian and Hamiltonian mechanics, 
coupled with the availability of a wide arsenal of variational principles and mathematical techniques, provides 
a remarkably powerful approach for deriving the equations of motion required to determine the response of
systems in a broad and diverse range of applications in science and engineering. 


Appendix A 


Matrix algebra 


A.1 Mathematical methods for mechanics 


Development of classical mechanics has involved a close and synergistic interweaving of physics and mathe- 
matics, that continues to play a key role in these fields. The concepts of scalar and vector fields play a pivotal 
role in describing the force fields and particle motion in both the Newtonian formulation of classical mechan- 
ics and electromagnetism. Thus it is imperative that you be familiar with the sophisticated mathematical 
formalism used to treat multivariate scalar and vector fields in classical mechanics. Ordinary and partial 
differential equations up to second order, as well as integration of algebraic and trigonometric functions play 
a major role in classical mechanics. It is assumed that you already have a working knowledge of differential 
and integral calculus in sufficient depth to handle this material. Computer codes, such as Mathematica, 
MatLab, and Maple, or symbolic calculators, can be used to obtain mathematical solutions for complicated 
cases. 

The following 9 appendices provide brief summaries of matrix algebra, vector algebra, orthogonal co- 
ordinate systems, coordinate transformations, tensor algebra, multivariate calculus, vector differential plus 
integral calculus, Fourier analysis and time-sampled waveform analysis. The manipulation of scalar and 
vector fields is greatly facilitated by transforming to orthogonal curvilinear coordinate systems that match 
the symmetries of the problem. These appendices discuss the necessity to account for the time dependence 
of the orthogonal unit vectors for curvilinear coordinate systems. It is assumed that, except for coordinate 
transformations and tensor algebra, you have been introduced to these topics in linear algebra and other 
physics courses, and thus the purpose of these appendices is to serve as a reference and brief review. 


A.2 Matrices 


Matrix algebra provides an elegant and powerful representation of multivariate operators, and coordinate 
transformations that feature prominently in classical mechanics. For example they play a pivotal role in 
finding the eigenvalues and eigenfunctions for coupled equations that occur in rigid-body rotation, and 
coupled oscillator systems. An understanding of the role of matrix mechanics in classical mechanics facilitates 
understanding of the equally important role played by matrix mechanics in quantal physics. 

It is interesting that although determinants were used by physicists in the late 19th century, the concept 
of matrix algebra was developed by Arthur Cayley in England in 1855, but many of these ideas were the work 
of Hamilton, and the discussion of matrix algebra was buried in a more general discussion of determinants. 
Matrix algebra was an esoteric branch of mathematics, little known by the physics community, until 1925 
when Heisenberg proposed his innovative new quantum theory. The striking feature of this new theory 
was its representation of physical quantities by sets of time-dependent complex numbers and a peculiar 
multiplication rule. Max Born recognized that Heisenberg's multiplication rule is just the standard “row 
times column” multiplication rule of matrix algebra; a topic that he had encountered as a young student in a 
mathematics course. In 1924 Richard Courant had just completed the first volume of the new text Methods 
of Mathematical Physics during which Pascual Jordan had served as his young assistant working on matrix 
manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover during 
which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced 
himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan 
paper[Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in 
November by the Born-Heisenberg-Jordan sequel[Bor25b] that established a logically consistent general method 
for solving matrix mechanics problems plus a connection between the mathematics of matrix mechanics and 
linear algebra. Matrix algebra developed into an important tool in mathematics and physics during World 
War 2 and now it is an integral part of undergraduate linear algebra courses. 

Most applications of matrix algebra in this book are restricted to real, symmetric, square matrices. The 
size of a matrix is defined by the rank, which equals the row rank and column rank, i.e. the number of 
independent row vectors or column vectors in the square matrix. It is presumed that you have studied 
matrices in a linear algebra course. Thus the goal of this review is to list simple manipulation of symmetric 
matrices and matrix diagonalization that will be used in this course. You are referred to a linear algebra 
textbook if you need further details. 


Matrix definition 


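A matrix is a rectangular array of numbers with M rows and N columns. The notation used for an element 
of a matrix is $A_{ij}$ where $i$ designates the row and $j$ designates the column of this matrix element in the 
matrix A. Convention denotes a matrix A as 


$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1(N-1)} & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2(N-1)} & A_{2N} \\ \vdots & \vdots & A_{ij} & \vdots & \vdots \\ A_{(M-1)1} & A_{(M-1)2} & \cdots & A_{(M-1)(N-1)} & A_{(M-1)N} \\ A_{M1} & A_{M2} & \cdots & A_{M(N-1)} & A_{MN} \end{pmatrix} \tag{A.1}$$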


Matrices can be square, $M = N$, or rectangular, $M \neq N$. Matrices having only one row or column are 
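called row or column vectors respectively, and need only a single subscript label. For example, a column 
vector with N elements can be written as 


$$\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix} \tag{A.2}$$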


Matrix manipulation 


Matrices are defined to obey certain rules for matrix manipulation as given below. 
1) Multiplication of a matrix A by a scalar $\lambda$ simply multiplies each matrix element by $\lambda$. 


$$C_{ij} = \lambda A_{ij} \tag{A.3}$$

2) Addition of two matrices A and B having the same rank, i.e. the same number of rows and columns, is given by 


$$C_{ij} = A_{ij} + B_{ij} \tag{A.4}$$


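3) Multiplication of a matrix A by a matrix B is defined only if the number of columns in A equals the 
number of rows in B. The product matrix C is given by the matrix product 


$$\mathbf{C} = \mathbf{A}\cdot\mathbf{B} \tag{A.5}$$

$$C_{ij} = [\mathbf{A}\cdot\mathbf{B}]_{ij} = \sum_{k} A_{ik} B_{kj} \tag{A.6}$$

For example, if both A and B are rank three symmetric matrices then 

$$\mathbf{C} = \mathbf{A}\cdot\mathbf{B} = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \cdot \begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix}$$

$$= \begin{pmatrix} A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\ A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\ A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33} \end{pmatrix}$$

In general, multiplication of matrices A and B is noncommutative, i.e. 

$$\mathbf{A}\cdot\mathbf{B} \neq \mathbf{B}\cdot\mathbf{A} \tag{A.7}$$


In the special case when $\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$ then the matrices are said to commute. 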
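The row-times-column rule of equation A.6 can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `matmul` and the two sample matrices are assumptions chosen for the example and do not come from the text.

```python
def matmul(A, B):
    """Return C = A.B with C[i][j] = sum_k A[i][k] * B[k][j] (equation A.6)."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns in A must equal number of rows in B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Two sample 2x2 matrices (illustrative values only).
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)   # multiplying by B on the right swaps the columns of A
BA = matmul(B, A)   # multiplying by B on the left swaps the rows of A
commute = (AB == BA)
print(AB, BA, commute)
```

For these sample matrices the two products differ, which illustrates equation A.7.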


Transposed matrix $\mathbf{A}^{T}$ 


The transpose of a matrix A will be denoted by $\mathbf{A}^{T}$ and is given by interchanging rows and columns, that is 

$$(\mathbf{A}^{T})_{ij} = A_{ji} \tag{A.8}$$


The transpose of a column vector is a row vector. Note that older texts use the symbol $\tilde{\mathbf{A}}$ for the transpose. 


Identity (unity) matrix I 


The identity (unity) matrix I is diagonal with diagonal elements equal to 1, that is 


$$I_{ij} = \delta_{ij} \tag{A.9}$$

where the Kronecker delta symbol is defined by 

$$\delta_{ik} = \begin{cases} 0 & \text{if } i \neq k \\ 1 & \text{if } i = k \end{cases} \tag{A.10}$$


Inverse matrix $\mathbf{A}^{-1}$ 


If a matrix is non-singular, that is, its determinant is non-zero, then it is possible to define an inverse matrix 
$\mathbf{A}^{-1}$. A non-singular square matrix has an inverse matrix for which the product 


$$\mathbf{A}\cdot\mathbf{A}^{-1} = \mathbf{I} \tag{A.11}$$


Orthogonal matrix 
A matrix with real elements is orthogonal if 

$$\mathbf{A}^{T} = \mathbf{A}^{-1} \tag{A.12}$$


That is 

$$\sum_{k} (\mathbf{A}^{T})_{ik} A_{kj} = \sum_{k} A_{ki} A_{kj} = \delta_{ij} \tag{A.13}$$
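The orthogonality condition of equations A.12 and A.13 can be checked numerically. The sketch below is illustrative only: the helper names (`transpose`, `matmul`, `is_orthogonal`), the rotation angle, and the sample shear matrix are assumptions made for the example.

```python
import math

def transpose(A):
    """Interchange rows and columns: (A^T)[i][j] = A[j][i] (equation A.8)."""
    return [list(column) for column in zip(*A)]

def matmul(A, B):
    """Row-times-column matrix product (equation A.6)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def is_orthogonal(A, tol=1e-12):
    """True if A^T . A equals the identity matrix to within tol (equation A.13)."""
    product = matmul(transpose(A), A)
    n = len(A)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

theta = 0.3  # rotation angle in radians (arbitrary sample value)
R = [[math.cos(theta), math.sin(theta)],
     [-math.sin(theta), math.cos(theta)]]   # two-dimensional rotation
S = [[1.0, 1.0],
     [0.0, 1.0]]                            # a shear, which is not orthogonal

print(is_orthogonal(R), is_orthogonal(S))
```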


Adjoint matrix $\mathbf{A}^{\dagger}$ 


For a matrix with complex elements, the adjoint matrix, denoted by $\mathbf{A}^{\dagger}$, is defined as the transpose of the 
complex conjugate 


$$(\mathbf{A}^{\dagger})_{ij} = A^{*}_{ji} \tag{A.14}$$


Hermitian matrix 


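The Hermitian conjugate of a complex matrix H is denoted as $\mathbf{H}^{\dagger}$ and is defined as 


$$\mathbf{H}^{\dagger} = (\mathbf{H}^{T})^{*} = (\mathbf{H}^{*})^{T} \tag{A.15}$$

Therefore 

$$H^{\dagger}_{ij} = H^{*}_{ji} \tag{A.16}$$

A matrix is Hermitian if it is equal to its adjoint 

$$\mathbf{H}^{\dagger} = \mathbf{H} \tag{A.17}$$

that is 

$$H^{\dagger}_{ij} = H^{*}_{ji} = H_{ij} \tag{A.18}$$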


A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has 
no effect. 


Unitary matrix 


A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix 

$$\mathbf{U}^{\dagger} = \mathbf{U}^{-1} \tag{A.19}$$

which is equivalent to 

$$\mathbf{U}^{\dagger}\mathbf{U} = \mathbf{I} \tag{A.20}$$


A unitary matrix with real elements is an orthogonal matrix as given in equation A.12. 
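The definitions of the adjoint, Hermitian, and unitary matrices (equations A.14 to A.20) can be illustrated with Python's built-in complex numbers. This is a minimal sketch: the helper names and the two sample matrices `H` and `U` are assumptions chosen for illustration, not matrices taken from the text.

```python
def adjoint(A):
    """Adjoint (Hermitian conjugate): (A^dagger)[i][j] = conj(A[j][i]) (equation A.14)."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[j][i]).conjugate() for j in range(rows)]
            for i in range(cols)]

def matmul(A, B):
    """Row-times-column matrix product (equation A.6)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Element-wise comparison of two matrices of the same shape."""
    return all(abs(a - b) < tol
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

identity = [[1, 0],
            [0, 1]]
H = [[2, 1 - 1j],
     [1 + 1j, 3]]     # sample matrix that equals its own adjoint
U = [[0, -1j],
     [1j, 0]]         # sample matrix that is both Hermitian and unitary

h_is_hermitian = close(adjoint(H), H)                       # equation A.17
u_is_unitary = close(matmul(adjoint(U), U), identity)       # equation A.20
h_is_unitary = close(matmul(adjoint(H), H), identity)
print(h_is_hermitian, u_is_unitary, h_is_unitary)
```

The sample matrix `H` is Hermitian but not unitary, which shows that the two properties are independent.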


Trace of a square matrix TrA 


The trace of a square matrix, denoted by TrA, is defined as the sum of the diagonal matrix elements. 


$$\mathrm{Tr}\,\mathbf{A} = \sum_{i=1}^{N} A_{ii} \tag{A.21}$$


Inner product of column vectors 


Real vectors The generalization of the scalar (dot) product in Euclidean space is called the inner product. 
Exploiting the rules of matrix multiplication requires taking the transpose of the first column vector 
to form a row vector which then is multiplied by the second column vector using the conventional rules for 
matrix multiplication. That is, for rank N vectors 


$$[\mathbf{X}]\cdot[\mathbf{Y}] = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix} \cdot \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = [\mathbf{X}]^{T}[\mathbf{Y}] = \begin{pmatrix} X_1 & X_2 & \cdots & X_N \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_i Y_i \tag{A.22}$$


For rank N = 3 this inner product agrees with the conventional definition of the scalar product and gives a 
result that is a scalar. For the special case when $[\mathbf{X}]\cdot[\mathbf{Y}] = 0$ the two column vectors are called orthogonal. 
The magnitude squared of a column vector is given by the inner product 


$$[\mathbf{X}]\cdot[\mathbf{X}] = \sum_{i=1}^{N} (X_i)^2 \geq 0 \tag{A.23}$$

Note that this is never negative. 


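Complex vectors For vectors having complex matrix elements the inner product is generalized to a form 
that is consistent with equation A.22 when the column vector matrix elements are real. 


$$[\mathbf{X}]^{*}\cdot[\mathbf{Y}] = [\mathbf{X}]^{\dagger}[\mathbf{Y}] = \begin{pmatrix} X_1^{*} & X_2^{*} & \cdots & X_{N-1}^{*} & X_N^{*} \end{pmatrix} \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N-1} \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_i^{*} Y_i \tag{A.24}$$


For the special case 


$$[\mathbf{X}]^{*}\cdot[\mathbf{X}] = [\mathbf{X}]^{\dagger}[\mathbf{X}] = \sum_{i=1}^{N} X_i^{*} X_i \geq 0 \tag{A.25}$$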
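The inner products of equations A.22 to A.25 can be sketched with a single function, since complex conjugation has no effect on real components. The function name `inner` and the sample vectors are assumptions chosen for illustration.

```python
def inner(X, Y):
    """Inner product sum_i conj(X_i) * Y_i; reduces to equation A.22 for real vectors."""
    if len(X) != len(Y):
        raise ValueError("vectors must have the same number of elements")
    return sum(x.conjugate() * y for x, y in zip(X, Y))

# Real sample vectors.
X = [1, 2, 2]
Y = [2, -1, 0]
xy = inner(X, Y)     # vanishes, so X and Y are orthogonal
xx = inner(X, X)     # magnitude squared of X, equation A.23

# Complex sample vector.
Z = [1 + 1j, 2j]
zz = inner(Z, Z)     # real and non-negative, equation A.25

print(xy, xx, zz)
```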

A.3 Determinants 


Definition 


The determinant of a square matrix with N rows equals a single number derived using the matrix elements 
of the matrix. The determinant is denoted as det A or |A| where 


$$|\mathbf{A}| = \sum_{j_1, j_2, \ldots, j_N} \varepsilon(j_1, j_2, \ldots, j_N)\, A_{1j_1} A_{2j_2} \cdots A_{Nj_N} \tag{A.26}$$


where $\varepsilon(j_1, j_2, \ldots, j_N)$ is the permutation index, which is $+1$ or $-1$ depending on whether the number of 
permutations required to go from the normal order $(1, 2, 3, \ldots, N)$ to the sequence $(j_1 j_2 j_3 \ldots j_N)$ is even or odd. 
For example for N = 3 the determinant is 


$$|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{13}A_{22}A_{31} - A_{11}A_{23}A_{32} - A_{12}A_{21}A_{33} \tag{A.27}$$
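The permutation sum of equation A.26 can be written out directly. The sketch below is illustrative and is not an efficient algorithm, since the number of terms grows as N!; the function names and the sample matrix are assumptions made for the example. The result is compared with the explicit N = 3 expansion of equation A.27.

```python
from itertools import permutations

def parity(perm):
    """Return +1 for an even permutation and -1 for an odd one, by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant as the signed sum over all permutations of the column indices (A.26)."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = parity(perm)
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total

def det3_explicit(A):
    """The six-term expansion of equation A.27 for a 3x3 matrix."""
    return (A[0][0] * A[1][1] * A[2][2] + A[0][1] * A[1][2] * A[2][0]
            + A[0][2] * A[1][0] * A[2][1] - A[0][2] * A[1][1] * A[2][0]
            - A[0][0] * A[1][2] * A[2][1] - A[0][1] * A[1][0] * A[2][2])

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 2]]   # sample matrix (illustrative values only)

print(det(A), det3_explicit(A))
```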


Properties 


1. The value of a determinant |A| = 0, if 


(a) all elements of a row (column) are zero. 


(b) all elements of a row (column) are identical with, or multiples of, the corresponding elements of 
another row (column). 


2. The value of a determinant is unchanged if 


(a) rows and columns are interchanged. 


(b) a linear combination of any number of the other rows is added to any one row. 


3. The value of a determinant changes sign if two rows, or any two columns, are interchanged. 
4. Transposing a square matrix does not change its determinant: $|\mathbf{A}^{T}| = |\mathbf{A}|$. 


5. If any row (column) is multiplied by a constant factor then the value of the determinant is multiplied 
by the same factor. 


6. The determinant of a diagonal matrix equals the product of the diagonal matrix elements. That is, 
when $A_{ij} = \lambda_i \delta_{ij}$ then $|\mathbf{A}| = \lambda_1 \lambda_2 \lambda_3 \cdots \lambda_N$. 


7. The determinant of the identity (unity) matrix |I| = 1. 
8. The determinant of the null matrix, for which all matrix elements are zero, |0| = 0 
9. A singular matrix has a determinant equal to zero. 
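10. If each element of any row (column) appears as the sum (difference) of two or more quantities, then 
the determinant can be written as a sum (difference) of two or more determinants of the same order. 
For example for order N = 2, 


$$\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} \pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$$


11. The determinant of a matrix product equals the product of the determinants. That is, if C = AB then 
$|\mathbf{C}| = |\mathbf{A}|\,|\mathbf{B}|$. 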
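Several of the properties listed above can be spot-checked with integer matrices, for which the arithmetic is exact. This is a sketch under stated assumptions: the helper names and the sample matrices `A` and `B` are chosen for illustration, and a check on one pair of matrices is of course not a proof.

```python
def det3(M):
    """Determinant of a 3x3 matrix written out as in equation A.27."""
    return (M[0][0] * M[1][1] * M[2][2] + M[0][1] * M[1][2] * M[2][0]
            + M[0][2] * M[1][0] * M[2][1] - M[0][2] * M[1][1] * M[2][0]
            - M[0][0] * M[1][2] * M[2][1] - M[0][1] * M[1][0] * M[2][2])

def matmul(A, B):
    """Row-times-column matrix product (equation A.6)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Interchange rows and columns (equation A.8)."""
    return [list(column) for column in zip(*A)]

A = [[2, 0, 1], [1, 3, 2], [1, 1, 2]]
B = [[1, 2, 0], [0, 1, 1], [2, 0, 1]]

swapped = [A[1], A[0], A[2]]                                   # property 3
scaled = [[3 * x for x in A[0]], A[1], A[2]]                   # property 5
added = [A[0], [y + 2 * x for x, y in zip(A[0], A[1])], A[2]]  # property 2(b)

print(det3(A), det3(swapped), det3(transpose(A)), det3(scaled),
      det3(added), det3(matmul(A, B)), det3(A) * det3(B))
```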

Cofactor of a square matrix 


For a square matrix having N rows the cofactor is obtained by removing the $i^{th}$ row and the $j^{th}$ column 
and then collapsing the remaining matrix elements into a square matrix with N − 1 rows while preserving 
the order of the matrix elements. This is called the complementary minor which is denoted as $\mathbf{A}^{(ij)}$. The 
matrix elements of the cofactor square matrix a are obtained by multiplying the determinant of the (ij) 
complementary minor by the phase factor $(-1)^{i+j}$. That is 


$$a_{ij} = (-1)^{i+j}\,|\mathbf{A}^{(ij)}| \tag{A.28}$$

The cofactor matrix has the property that 


$$\sum_{k=1}^{N} A_{ik} a_{jk} = \delta_{ij}\,|\mathbf{A}| = \sum_{k=1}^{N} A_{ki} a_{kj} \tag{A.29}$$


Cofactors are used to expand the determinant of a square matrix in order to evaluate the determinant. 


Inverse of a non-singular matrix 


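The (i, j) matrix elements of the inverse matrix $\mathbf{A}^{-1}$ of a non-singular matrix A are given by the ratio of 
the cofactor $a_{ji}$ and the determinant |A|, that is 


$$(\mathbf{A}^{-1})_{ij} = \frac{1}{|\mathbf{A}|}\, a_{ji} \tag{A.30}$$

Equations A.28 and A.29 can be used to evaluate the i, j element of the matrix product $(\mathbf{A}^{-1}\mathbf{A})$ 

$$(\mathbf{A}^{-1}\mathbf{A})_{ij} = \sum_{k=1}^{N} (\mathbf{A}^{-1})_{ik} A_{kj} = \frac{1}{|\mathbf{A}|} \sum_{k=1}^{N} a_{ki} A_{kj} = \delta_{ij}\,\frac{|\mathbf{A}|}{|\mathbf{A}|} = \delta_{ij} = I_{ij} \tag{A.31}$$


This agrees with equation A.11 that $\mathbf{A}\cdot\mathbf{A}^{-1} = \mathbf{I}$. 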
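The cofactor construction of equations A.28 to A.31 can be sketched with exact rational arithmetic, so that the product of the inverse and the matrix reproduces the identity matrix exactly. The function names and the sample integer matrix are assumptions made for the example; cofactor expansion is used here for clarity rather than efficiency.

```python
from fractions import Fraction

def minor(A, i, j):
    """Complementary minor: delete row i and column j, preserving the order of the rest."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1          # determinant of the empty (0x0) matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def inverse(A):
    """Inverse from equation A.30: (A^-1)[i][j] = a[j][i] / |A|, with a the cofactor matrix."""
    n = len(A)
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    cof = [[(-1) ** (i + j) * det(minor(A, i, j)) for j in range(n)]
           for i in range(n)]                      # equation A.28
    return [[Fraction(cof[j][i], d) for j in range(n)] for i in range(n)]

A = [[2, 0, 1],
     [1, 3, 2],
     [1, 1, 2]]   # sample integer matrix with determinant 6

A_inv = inverse(A)
product = [[sum(A_inv[i][k] * A[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]                      # equation A.31
print(A_inv)
print(product)
```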

The inverse of rank 2 or 3 matrices is required frequently when determining the eigen-solutions for rigid- 
body rotation, or coupled oscillator, problems in classical mechanics as described in chapters 11 and 12. 
Therefore it is convenient to list explicitly the inverse matrices for both rank 2 and rank 3 matrices. 


Inverse for rank 2 matrices: 


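$$\mathbf{A}^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{|\mathbf{A}|} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \tag{A.32}$$


where the determinant of A is written explicitly in equation A.32. 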


Inverse for rank 3 matrices: 


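$$\mathbf{A}^{-1} = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}^{-1} = \frac{1}{|\mathbf{A}|} \begin{pmatrix} A & B & C \\ D & E & F \\ G & H & I \end{pmatrix}^{T} = \frac{1}{|\mathbf{A}|} \begin{pmatrix} A & D & G \\ B & E & H \\ C & F & I \end{pmatrix}$$

$$= \frac{1}{aA + bB + cC} \begin{pmatrix} A = (ei - fh) & D = -(bi - ch) & G = (bf - ce) \\ B = -(di - fg) & E = (ai - cg) & H = -(af - cd) \\ C = (dh - eg) & F = -(ah - bg) & I = (ae - bd) \end{pmatrix} \tag{A.33}$$


where the functions A, B, C, D, E, F, G, H, I are equal to the rank 2 determinants listed in equation A.33. 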


A.4 Reduction of a matrix to diagonal form 


Solving coupled linear equations can be reduced to diagonalization of a matrix. Consider the matrix A 
operating on the vector X to produce a vector Y, that are expressed as components with respect to the 
unprimed coordinate frame, i.e. 

$$\mathbf{A}\cdot\mathbf{X} = \mathbf{Y} \tag{A.34}$$


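Consider that the unitary real matrix R with rank n rotates the n-dimensional un-primed coordinate 
frame into the primed coordinate frame such that $\mathbf{A}$, $\mathbf{X}$ and $\mathbf{Y}$ are transformed to $\mathbf{A}'$, $\mathbf{X}'$ and $\mathbf{Y}'$ in the 
rotated primed coordinate frame. Then 


$$\mathbf{X}' = \mathbf{R}\cdot\mathbf{X}, \qquad \mathbf{Y}' = \mathbf{R}\cdot\mathbf{Y} \tag{A.35}$$


With respect to the primed coordinate frame equation (A.34) becomes 


$$\mathbf{R}\cdot(\mathbf{A}\cdot\mathbf{X}) = \mathbf{R}\cdot\mathbf{Y} \tag{A.36}$$

$$\mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{-1}\cdot\mathbf{R}\cdot\mathbf{X} = \mathbf{R}\cdot\mathbf{Y} \tag{A.37}$$

$$\mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{T}\cdot\mathbf{X}' = \mathbf{A}'\cdot\mathbf{X}' = \mathbf{Y}' \tag{A.38}$$


using the fact that the identity matrix $\mathbf{I} = \mathbf{R}\cdot\mathbf{R}^{-1} = \mathbf{R}\cdot\mathbf{R}^{T}$ since the rotation matrix in n dimensions is 
orthogonal. 
Thus we have that the rotated matrix 


$$\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{T} \tag{A.39}$$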


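Let us assume that this transformed matrix is diagonal, then it can be written as the product of the unit 
matrix I and a vector of scalar numbers called the characteristic roots $\lambda$ as 


$$\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{T} = \lambda\mathbf{I} \tag{A.40}$$

using the fact that $\mathbf{R}^{T} = \mathbf{R}^{-1}$ then gives 

$$\mathbf{R}^{T}\cdot(\lambda\mathbf{I}) = \mathbf{A}\cdot\mathbf{R}^{T} \tag{A.41}$$

Let both sides of equation A.41 act on $\mathbf{X}'$ which gives 

$$\lambda\mathbf{I}\cdot\mathbf{X}' = \mathbf{A}'\cdot\mathbf{X}' \tag{A.42}$$


or 

$$[\lambda\mathbf{I} - \mathbf{A}']\,\mathbf{X}' = 0 \tag{A.43}$$


This represents a set of n homogeneous linear algebraic equations in n unknowns $\mathbf{X}'$ where $\lambda$ is a set of 
characteristic roots (eigenvalues) with corresponding eigenfunctions $\mathbf{X}'$. Ignoring the trivial case of $\mathbf{X}'$ being 
zero, then (A.43) requires that the secular determinant of the bracket be zero, that is 


$$|\lambda\mathbf{I} - \mathbf{A}'| = 0 \tag{A.44}$$


The determinant can be expanded and factored into the form 


$$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_n) = 0 \tag{A.45}$$


where the n eigenvalues are $\lambda = \lambda_1, \lambda_2, \ldots, \lambda_n$ of the matrix $\mathbf{A}'$. 
The eigenvectors $\mathbf{X}'$ corresponding to each eigenvalue are determined by substituting a given eigenvalue 
$\lambda_i$ into the relation 

$$\mathbf{X}'^{T}\cdot\mathbf{A}\cdot\mathbf{X}' = [\lambda_i \delta_{ij}] \tag{A.46}$$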


If all the eigenvalues are distinct, i.e. different, then this set of n equations completely determines the ratio 
of the components of each eigenvector along the axes of the coordinate frame. However, when two or more 
eigenvalues are identical, the eigenvectors belonging to the repeated eigenvalue are not uniquely determined, 
and one has the freedom to select appropriate eigenvectors that are orthogonal to each other and to the remaining axes. 

In summary, a real symmetric matrix can always be fully diagonalized by an orthogonal (real unitary) 
transformation, but the eigenvectors are determined uniquely, up to normalization and sign, only if all the 
eigenvalues are distinct. 
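The similarity transformation of equation A.39 can be illustrated numerically for a small case. The sketch below is an illustration under stated assumptions: the 2 × 2 symmetric matrix and the 45° rotation angle are sample values chosen because the rotation happens to diagonalize this particular matrix; they are not taken from the text.

```python
import math

def matmul(A, B):
    """Row-times-column matrix product (equation A.6)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    """Interchange rows and columns (equation A.8)."""
    return [list(column) for column in zip(*A)]

A = [[2.0, 1.0],
     [1.0, 2.0]]                 # sample real symmetric matrix
c = s = 1.0 / math.sqrt(2.0)     # cosine and sine of a 45 degree rotation
R = [[c, s],
     [-s, c]]                    # orthogonal rotation matrix

A_prime = matmul(matmul(R, A), transpose(R))       # A' = R.A.R^T, equation A.39
eigenvalues = [A_prime[0][0], A_prime[1][1]]       # diagonal elements of A'
off_diagonal = max(abs(A_prime[0][1]), abs(A_prime[1][0]))
print(eigenvalues, off_diagonal)
```

For this sample matrix the rotated matrix is diagonal to within rounding error, with the eigenvalues 3 and 1 on the diagonal, and the rows of `R` are the corresponding normalized eigenvectors.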

A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear 
equations of the form 


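$$\begin{aligned} A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= 0 \\ A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= 0 \\ &\;\;\vdots \\ A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= 0 \end{aligned} \tag{A.47}$$

Making the following definitions 

$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{pmatrix} \tag{A.48}$$

$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \tag{A.49}$$

Then the set of linear equations can be written in a compact form using the matrices 

$$\mathbf{A}\cdot\mathbf{X} = 0 \tag{A.50}$$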


which can be solved using equation (A.43). Ensure that you are able to diagonalize matrices with rank 
2 and 3. You can use Mathematica, Maple, MatLab, or other such mathematical computer programs to 
diagonalize larger matrices. 


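A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix 


Consider the matrix 

$$\mathbf{A} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$


The secular determinant is given by (A.42) 


$$\begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0$$


This expands to 

$$-\lambda(\lambda + 1)(\lambda - 1) = 0$$


Thus the three eigenvalues are $\lambda = -1, 0, 1$. 
To find each eigenvector we substitute the corresponding eigenvalue into equation (A.48). 


$$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$


The eigenvalue $\lambda = -1$ yields $x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0)$. The 
eigenvalue $\lambda = 0$ yields $x = 0$ and $y = 0$. Thus the eigenvector is $\mathbf{r}_2 = (0, 0, 1)$. The eigenvalue $\lambda = 1$ 
yields $-x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_3 = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0)$. The orthogonality of these three 
eigenvectors, which correspond to three distinct eigenvalues, can be verified. 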
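The result of this example can be verified numerically. In the sketch below the matrix is the one implied by the secular determinant of the example, the eigenvectors are the normalized vectors quoted above, and the helper names are assumptions made for illustration.

```python
import math

A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]                  # matrix implied by the secular determinant

s = 1.0 / math.sqrt(2.0)
pairs = [(-1, (s, -s, 0.0)),     # (eigenvalue, eigenvector)
         (0, (0.0, 0.0, 1.0)),
         (1, (s, s, 0.0))]

def apply(M, v):
    """Matrix acting on a column vector."""
    return tuple(sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M)))

def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

def is_eigenpair(M, lam, r, tol=1e-12):
    """True if M.r equals lam * r to within tol."""
    return all(abs(mr - lam * x) < tol for mr, x in zip(apply(M, r), r))

all_eigenpairs = all(is_eigenpair(A, lam, r) for lam, r in pairs)
mutually_orthogonal = all(abs(dot(pairs[i][1], pairs[j][1])) < 1e-12
                          for i in range(3) for j in range(i + 1, 3))
print(all_eigenpairs, mutually_orthogonal)
```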
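
A.2 Example: Degenerate eigenvalues of real symmetric matrix 


This example illustrates how to generate eigenvectors corresponding to degenerate eigenvalues. Consider 
the matrix 


$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$


The secular determinant is given by (A.42) 


$$\begin{vmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0$$


This expands to 

$$(1 - \lambda)(\lambda + 1)(\lambda - 1) = 0$$


Thus the three eigenvalues are $\lambda = -1, 1, 1$. 
The eigenvectors are determined by substituting the corresponding eigenvalue into equation (A.42). 


$$\begin{pmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$


The eigenvalue $\lambda = -1$ yields $2x = 0$ and $y + z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (0, \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}})$. The 
eigenvalue $\lambda = 1$ yields $-y + z = 0$. The eigenvector $\mathbf{r}_2$ must be perpendicular to $\mathbf{r}_1$ and there are an infinite 
number of choices. Let us assume that $\mathbf{r}_2 = (0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$ which satisfies equation (A.50) then the eigenvector 
$\mathbf{r}_3$ must be perpendicular to both $\mathbf{r}_1$ and $\mathbf{r}_2$. For rank three this is found using 


$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0)$$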
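The construction used in this example can be checked numerically. In the sketch below the helper names are assumptions made for illustration; the matrix and the vectors are those of the example, and the third eigenvector is generated from the vector product of the first two.

```python
import math

A = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]

s = 1.0 / math.sqrt(2.0)
r1 = (0.0, s, -s)    # eigenvector for the eigenvalue -1
r2 = (0.0, s, s)     # one choice of eigenvector for the degenerate eigenvalue +1

def apply(M, v):
    """Matrix acting on a column vector."""
    return tuple(sum(M[i][k] * v[k] for k in range(3)) for i in range(3))

def cross(u, v):
    """Vector product of two three-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def is_eigenpair(M, lam, r, tol=1e-12):
    """True if M.r equals lam * r to within tol."""
    return all(abs(mr - lam * x) < tol for mr, x in zip(apply(M, r), r))

r3 = cross(r1, r2)   # perpendicular to both r1 and r2
checks = (is_eigenpair(A, -1, r1), is_eigenpair(A, 1, r2), is_eigenpair(A, 1, r3))
print(r3, checks)
```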

Appendix B 


Vector algebra 


B.1 Linear operations 


The important force fields in classical mechanics, namely, gravitation, electric, and magnetic, are vector 
fields that have a position-dependent magnitude and direction. Thus, it is useful to summarize the algebra 
of vector fields. 

A vector $\mathbf{a}$ has both a magnitude $|\mathbf{a}|$ and a direction defined by the unit vector $\hat{\mathbf{e}}_a$, that is, the vector 
can be written as a bold character $\mathbf{a}$ where 

$$\mathbf{a} = a\,\hat{\mathbf{e}}_a \tag{B.1}$$


where by convention the implied modulus sign is omitted. The hat symbol on the vector $\hat{\mathbf{e}}_a$ designates that 
this is a unit vector with modulus $|\hat{\mathbf{e}}_a| = 1$. 

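Vector force fields are assumed to be linear, and consequently they obey the principle of superposition, 
are commutative, associative, and distributive as illustrated below for three vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$ plus a scalar 
multiplier $\gamma$. 


$$\begin{aligned} \mathbf{a} + \mathbf{b} &= \mathbf{b} + \mathbf{a} \\ \mathbf{a} + (\mathbf{b} + \mathbf{c}) &= (\mathbf{a} + \mathbf{b}) + \mathbf{c} \\ \gamma(\mathbf{a} + \mathbf{b}) &= \gamma\mathbf{a} + \gamma\mathbf{b} \end{aligned} \tag{B.2}$$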


The manipulation of vectors is greatly facilitated by use of components along an orthogonal coordinate 
system defined by three orthogonal unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$. For example the cartesian coordinate system 
is defined by three unit vectors which, by convention, are called $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$. 


B.2 Scalar product 


Multiplication of two vectors can produce a 9-component tensor that can be represented by a 3 × 3 matrix 
as discussed in appendix E. There are two special cases for vector multiplication that are important for 
vector algebra; the first is the scalar product, and the second is the vector product. 

The scalar product of two vectors is defined to be 


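$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos\theta \tag{B.3}$$


where $\theta$ is the angle between the two vectors. It is a scalar and thus is independent of the orientation of 
the coordinate axis system. Note that the scalar product commutes, is distributive, and associative with a 
scalar multiplier, that is 


$$\begin{aligned} \mathbf{a}\cdot\mathbf{b} &= \mathbf{b}\cdot\mathbf{a} \\ \mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) &= \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c} \\ (\lambda\mathbf{a})\cdot\mathbf{b} &= \lambda(\mathbf{b}\cdot\mathbf{a}) \end{aligned} \tag{B.4}$$


Note that $\mathbf{a}\cdot\mathbf{a} = |\mathbf{a}|^2$ and if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular then $\cos\theta = 0$ and thus $\mathbf{a}\cdot\mathbf{b} = 0$. 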




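If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, that is, they are orthogonal unit vectors, 
then from equations B.3 and B.4 

$$\hat{\mathbf{e}}_i\cdot\hat{\mathbf{e}}_k = \delta_{ik} \tag{B.5}$$


If $\hat{\mathbf{a}}$ is the unit vector for the vector $\mathbf{a}$ then the scalar product of a vector $\mathbf{a}$ with one of these unit vectors 
$\hat{\mathbf{e}}_n$ gives $|\mathbf{a}|$ times the cosine of the angle between the vector $\mathbf{a}$ and $\hat{\mathbf{e}}_n$, that is 


$$\begin{aligned} \mathbf{a}\cdot\hat{\mathbf{e}}_1 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1) = |\mathbf{a}| \cos\alpha \\ \mathbf{a}\cdot\hat{\mathbf{e}}_2 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2) = |\mathbf{a}| \cos\beta \\ \mathbf{a}\cdot\hat{\mathbf{e}}_3 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3) = |\mathbf{a}| \cos\gamma \end{aligned} \tag{B.6}$$


where the cosines are called the direction cosines since they define the direction of the vector $\mathbf{a}$ with respect 
to each orthogonal basis unit vector. Moreover, $\mathbf{a}\cdot\hat{\mathbf{e}}_1 = |\mathbf{a}|\,\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1 = |\mathbf{a}| \cos\alpha$ is the component of $\mathbf{a}$ along the 
$\hat{\mathbf{e}}_1$ axis. Thus the three components of the vector $\mathbf{a}$ are fully defined by the magnitude $|\mathbf{a}|$ and the direction 
cosines, corresponding to the angles $\alpha, \beta, \gamma$. That is, 


$$\begin{aligned} a_1 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1) = |\mathbf{a}| \cos\alpha \\ a_2 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2) = |\mathbf{a}| \cos\beta \\ a_3 &= |\mathbf{a}|\,(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3) = |\mathbf{a}| \cos\gamma \end{aligned} \tag{B.7}$$


If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis then the vector is fully defined by 


$$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \tag{B.8}$$

Consider two vectors 

$$\begin{aligned} \mathbf{a} &= a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \\ \mathbf{b} &= b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3 \end{aligned}$$

Then using B.5 

$$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = |\mathbf{a}|\,|\mathbf{b}| \cos\theta \tag{B.9}$$


where $\theta$ is the angle between the two vectors. In particular, since the direction cosine $\cos\alpha_a = \frac{a_1}{|\mathbf{a}|}$, then 
equation B.9 gives 

$$\cos\theta = \cos\alpha_a \cos\alpha_b + \cos\beta_a \cos\beta_b + \cos\gamma_a \cos\gamma_b \tag{B.10}$$


Note that when $\theta = 0$ then B.10 gives 


$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1 \tag{B.11}$$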
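The relations between the scalar product and the direction cosines (equations B.9 to B.11) can be sketched numerically. The function names and the two sample vectors below are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    """Scalar product in an orthonormal basis, a1*b1 + a2*b2 + a3*b3 (equation B.9)."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude |a| of a vector."""
    return math.sqrt(dot(a, a))

def direction_cosines(a):
    """Cosines of the angles between a and each basis unit vector (equation B.7)."""
    n = norm(a)
    return tuple(x / n for x in a)

a = (1.0, 2.0, 2.0)     # sample vectors (illustrative values only)
b = (2.0, -1.0, 2.0)

cos_theta = dot(a, b) / (norm(a) * norm(b))                                  # B.9
cos_theta_from_cosines = dot(direction_cosines(a), direction_cosines(b))     # B.10
sum_of_squares = sum(c * c for c in direction_cosines(a))                    # B.11
print(cos_theta, cos_theta_from_cosines, sum_of_squares)
```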


B.3 Vector product 


The vector product of two vectors is defined to be 
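$$\mathbf{c} = \mathbf{a}\times\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \sin\theta\,\hat{\mathbf{n}} \tag{B.12}$$


where $\theta$ is the angle between the vectors and $\hat{\mathbf{n}}$ is a unit vector perpendicular to the plane defined by $\mathbf{a}$ 
and $\mathbf{b}$ such that the unit vectors $(\hat{\mathbf{a}}, \hat{\mathbf{b}}, \hat{\mathbf{n}})$ obey a right-handed screw rule. The vector product acts like a 
pseudovector which comprises a normal vector multiplied by a sign factor that depends on the handedness 
of the system as described in appendix D.3. 
The components of $\mathbf{c}$ are defined by the relation 


$$c_i = \sum_{jk} \varepsilon_{ijk}\, a_j b_k \tag{B.13}$$


where the (Levi-Civita) permutation symbol $\varepsilon_{ijk}$ has the following properties 


$$\begin{aligned} \varepsilon_{ijk} &= 0 && \text{if an index is equal to any other index} \\ \varepsilon_{ijk} &= +1 && \text{if } i, j, k \text{ form an even permutation of } 1, 2, 3 \\ \varepsilon_{ijk} &= -1 && \text{if } i, j, k \text{ form an odd permutation of } 1, 2, 3 \end{aligned} \tag{B.14}$$

For example, if the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, then $\hat{\mathbf{e}}_j \times \hat{\mathbf{e}}_k = \sum_{i} \varepsilon_{ijk}\hat{\mathbf{e}}_i$, i.e. 


$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2 \tag{B.15}$$

$$\hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_3 = -\hat{\mathbf{e}}_2 \tag{B.16}$$

$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_1 = 0 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_2 = 0 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_3 = 0 \tag{B.17}$$

The vector product anticommutes in that 

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a} \tag{B.18}$$

However, it is distributive and associative with a scalar multiplier 

$$\mathbf{a}\times(\mathbf{b} + \mathbf{c}) = \mathbf{a}\times\mathbf{b} + \mathbf{a}\times\mathbf{c} \tag{B.19}$$

$$(\lambda\mathbf{a})\times\mathbf{b} = \lambda(\mathbf{a}\times\mathbf{b}) \tag{B.20}$$

Note that when $\sin\theta = 0$ then $\mathbf{a}\times\mathbf{b} = 0$ and in particular, $\mathbf{a}\times\mathbf{a} = 0$. 
Consider two vectors 

$$\begin{aligned} \mathbf{a} &= a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \\ \mathbf{b} &= b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3 \end{aligned}$$

Then using equations B.12 and B.15 to B.17 

$$\mathbf{a}\times\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \sin\theta\,\hat{\mathbf{n}} = \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} = \hat{\mathbf{e}}_1(a_2b_3 - a_3b_2) + \hat{\mathbf{e}}_2(a_3b_1 - a_1b_3) + \hat{\mathbf{e}}_3(a_1b_2 - a_2b_1)$$


where $\theta$ is the angle between the two vectors and the determinant is expanded along the top row. Examples of 
vector products are torque $\mathbf{N} = \mathbf{r}\times\mathbf{F}$, angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, and the magnetic force $\mathbf{F}_B = q\mathbf{v}\times\mathbf{B}$. 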
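The component definition of the vector product in terms of the permutation symbol (equations B.13 and B.14) can be sketched as follows. The indices run over 0, 1, 2 rather than 1, 2, 3, and the compact product formula used for the permutation symbol is an implementation choice for this sketch, not something taken from the text; the function names and sample vectors are likewise assumptions.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}: +1 even, -1 odd, 0 if any index repeats."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """Vector product with components c_i = sum_jk eps_ijk a_j b_k (equation B.13)."""
    return tuple(sum(levi_civita(i, j, k) * a[j] * b[k]
                     for j in range(3) for k in range(3))
                 for i in range(3))

e1, e2, e3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)   # orthonormal basis
a = (1, 2, 3)                                   # sample vectors
b = (4, 5, 6)

print(cross(e1, e2), cross(a, b), cross(b, a), cross(a, a))
```

The output reproduces the basis relation of equation B.15, the anticommutation of equation B.18, and the vanishing of the vector product of a vector with itself.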


B.4 Triple products 


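The following scalar and vector triple products can be formed from the product of three vectors and are 
used frequently. 


Scalar triple products 


There are several permutations of scalar triple products of three vectors $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ that are identical. 

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{c}\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\mathbf{c}\times\mathbf{a}) = (\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} = -\mathbf{a}\cdot(\mathbf{c}\times\mathbf{b}) \tag{B.21}$$


That is, the scalar triple product is invariant to cyclic permutations of the three vectors but changes sign for 
interchange of two vectors. The scalar triple product is unchanged by swapping the scalar (dot) and vector (cross) products. 
Because of the symmetry the scalar triple product can be denoted as $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ and 


$$\begin{aligned} [\mathbf{a}, \mathbf{b}, \mathbf{c}] &> 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is right-handed} \\ [\mathbf{a}, \mathbf{b}, \mathbf{c}] &= 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is coplanar} \\ [\mathbf{a}, \mathbf{b}, \mathbf{c}] &< 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is left-handed} \end{aligned} \tag{B.22}$$


The scalar triple product can be written in terms of the components using a determinant 


$$[\mathbf{a}, \mathbf{b}, \mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \tag{B.23}$$


Vector triple product 


The vector triple product $\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$ is a vector. Since $(\mathbf{b}\times\mathbf{c})$ is perpendicular to the plane of $\mathbf{b}, \mathbf{c}$, then 
$\mathbf{a}\times(\mathbf{b}\times\mathbf{c})$ must lie in the plane containing $\mathbf{b}, \mathbf{c}$. Therefore the triple product can be expanded in terms of 
$\mathbf{b}, \mathbf{c}$, as given by the following identity 


$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = (\mathbf{a}\cdot\mathbf{c})\,\mathbf{b} - (\mathbf{a}\cdot\mathbf{b})\,\mathbf{c} \tag{B.24}$$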
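The triple-product identities of equations B.21 and B.24 can be spot-checked with integer vectors, for which the arithmetic is exact. The helper names and the sample vectors are assumptions made for illustration, and a check on one set of vectors is not a proof of the identities.

```python
def dot(u, v):
    """Scalar product of two three-component vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Vector product of two three-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

a, b, c = (1, 2, 3), (4, 5, 6), (7, 8, 10)   # sample vectors

# Scalar triple product and its cyclic permutations (equation B.21).
abc = dot(a, cross(b, c))
cab = dot(c, cross(a, b))
bca = dot(b, cross(c, a))
acb = dot(a, cross(c, b))      # interchange of two vectors changes the sign

# Vector triple product compared with the right-hand side of equation B.24.
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))

print(abc, cab, bca, acb, lhs, rhs)
```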


Workshop exercises 


1. Partition the following exercises among the group. Once you have completed your problem, check with a 
classmate before writing it on the board. After you have verified that you have found the correct solution, 
write your answer in the space provided on the board, taking care to include the steps that you used to arrive 
at your solution. The following information is needed. 


a = 3i + 2j − 9k    b = −2i + 3k    c = −2i + j − 6k    d = i + 9j + 4k 

2 7 —4 3 4 2 —4 -8 -1 -3 

E= 3 1 -2 P=(§ G= 7 1 H=| -4 2 -2 
—2 0 5 -1 1 -1 0 0 

Calculate each of the following 

1 la— (b + 3c)| 7 (Em 

2 Component of c along a 8 [HE] 

3 Angle between c and d 9 EHG 

4 (bxd)-a 10 EG-—HG 

5 (bx d) x a 11 EH-HE7 

6 bx (d x a) 12 Fr! 

Problems 


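[1] For what values of $a$ are the vectors $\mathbf{A} = 2a\hat{\mathbf{i}} - 2\hat{\mathbf{j}} + a\hat{\mathbf{k}}$ and $\mathbf{B} = a\hat{\mathbf{i}} + 2a\hat{\mathbf{j}} + 2\hat{\mathbf{k}}$ perpendicular? 


[2] Show that the triple scalar product $(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C}$ can be written as 


$$(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C} = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix}$$


Show also that the product is unaffected by interchange of the scalar and vector product operations or by change in 
the order of $\mathbf{A}, \mathbf{B}, \mathbf{C}$ as long as they are in cyclic order, that is 

$$(\mathbf{A}\times\mathbf{B})\cdot\mathbf{C} = \mathbf{A}\cdot(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\cdot(\mathbf{C}\times\mathbf{A}) = (\mathbf{C}\times\mathbf{A})\cdot\mathbf{B}$$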


Therefore we may use the notation ABC to denote the triple scalar product. Finally give a geometric interpre- 
tation of ABC by computing the volume of the parallelepiped defined by the three vectors A, B,C. 


Appendix C 


Orthogonal coordinate systems 


The methods of vector analysis provide a convenient representation of physical laws. However, the manip- 
ulation of scalar and vector fields is greatly facilitated by use of components with respect to an orthogonal 
coordinate system. 


C.1 Cartesian coordinates (x,y,z) 


Cartesian coordinates (rectangular) provide the simplest orthogonal rectangular coordinate system. The 
unit vectors specifying the direction along the three orthogonal axes are taken to be (i,j,k). In cartesian 
coordinates scalar and vector functions are written as 


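$$\phi = \phi(x, y, z) \tag{C.1}$$

$$\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}} \tag{C.2}$$


Calculation of the time derivatives of the position vector is especially simple using cartesian coordinates 
because the unit vectors $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$ are constant and independent in time. That is, 


$$\frac{d\hat{\mathbf{i}}}{dt} = \frac{d\hat{\mathbf{j}}}{dt} = \frac{d\hat{\mathbf{k}}}{dt} = 0$$


Since the time derivatives of the unit vectors are all zero then the velocity $\dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt}$ reduces to the time 
derivatives of $x$, $y$, and $z$. That is, 

$$\dot{\mathbf{r}} = \dot{x}\hat{\mathbf{i}} + \dot{y}\hat{\mathbf{j}} + \dot{z}\hat{\mathbf{k}} \tag{C.3}$$

Similarly the acceleration is given by 

$$\ddot{\mathbf{r}} = \ddot{x}\hat{\mathbf{i}} + \ddot{y}\hat{\mathbf{j}} + \ddot{z}\hat{\mathbf{k}} \tag{C.4}$$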


C.2 Curvilinear coordinate systems 


There are many examples in physics where the symmetry of the problem makes it more convenient to solve 
motion at a point P(x,y,z) using non-cartesian curvilinear coordinate systems. For example, problems 
having spherical symmetry are most conveniently handled using a spherical coordinate system $(r, \theta, \phi)$ 
with the origin at the center of spherical symmetry. Such problems occur frequently in electrostatics and 
gravitation; e.g. solutions of the atom, or planetary systems. Note that a cartesian coordinate system still 
is required to define the origin plus the polar and azimuthal angles $\theta, \phi$. Using spherical coordinates for 
a spherically symmetric system allows the problem to be factored into a cyclic angular part, the solution 
of which involves spherical harmonics that are common to all such spherically-symmetric problems, plus a 
one-dimensional radial part that contains the specifics of the particular spherically-symmetric potential. 
Similarly, for problems involving cylindrical symmetry, it is much more convenient to use a cylindrical 
coordinate system $(\rho, \phi, z)$. Again it is necessary to use a cartesian coordinate system to define the origin 
and angle $\phi$. Motion in a plane can be handled using two dimensional polar coordinates. 

Curvilinear coordinate systems introduce a complication in that the unit vectors are time dependent in 
contrast to cartesian coordinate system where the unit vectors (i,j, k) are independent and constant in time. 
The introduction of this time dependence warrants further discussion. 

Each of the three axes $q_i$ in curvilinear coordinate systems can be expressed in cartesian coordinates 
$(x, y, z)$ as surfaces of constant $q_i$ given by the function 


$$q_i = f_i(x, y, z) \tag{C.5}$$


where $i = 1, 2,$ or $3$. An element of length $ds_i$ perpendicular to the surface $q_i$ is the distance between the 
surfaces $q_i$ and $q_i + dq_i$ which can be expressed as 


$$ds_i = h_i\,dq_i \tag{C.6}$$


where $h_i$ is a function of $(q_1, q_2, q_3)$. In cartesian coordinates $h_1$, $h_2$, and $h_3$ are all unity. The unit-length 
vectors $\hat{\mathbf{q}}_1, \hat{\mathbf{q}}_2, \hat{\mathbf{q}}_3$ are perpendicular to the respective $q_1, q_2, q_3$ surfaces, and are oriented to have increasing 
indices such that $\hat{\mathbf{q}}_1 \times \hat{\mathbf{q}}_2 = \hat{\mathbf{q}}_3$. The correspondence of the curvilinear coordinates, unit vectors, and transform 
coefficients to cartesian, polar, cylindrical and spherical coordinates is given in table C.1. 


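Curvilinear | $q_1$ | $q_2$ | $q_3$ | $\hat{\mathbf{q}}_1$ | $\hat{\mathbf{q}}_2$ | $\hat{\mathbf{q}}_3$ | $h_1$ | $h_2$ | $h_3$ 
Cartesian | $x$ | $y$ | $z$ | $\hat{\mathbf{i}}$ | $\hat{\mathbf{j}}$ | $\hat{\mathbf{k}}$ | 1 | 1 | 1 
Polar | $r$ | $\theta$ | | $\hat{\mathbf{r}}$ | $\hat{\boldsymbol{\theta}}$ | | 1 | $r$ | 
Cylindrical | $\rho$ | $\phi$ | $z$ | $\hat{\boldsymbol{\rho}}$ | $\hat{\boldsymbol{\phi}}$ | $\hat{\mathbf{z}}$ | 1 | $\rho$ | 1 
Spherical | $r$ | $\theta$ | $\phi$ | $\hat{\mathbf{r}}$ | $\hat{\boldsymbol{\theta}}$ | $\hat{\boldsymbol{\phi}}$ | 1 | $r$ | $r\sin\theta$ 
Table C.1: Curvilinear coordinates 

The differential distance and volume elements are given by 

$$d\mathbf{s} = ds_1\hat{\mathbf{q}}_1 + ds_2\hat{\mathbf{q}}_2 + ds_3\hat{\mathbf{q}}_3 = h_1 dq_1\hat{\mathbf{q}}_1 + h_2 dq_2\hat{\mathbf{q}}_2 + h_3 dq_3\hat{\mathbf{q}}_3 \tag{C.7}$$

$$d\tau = ds_1\,ds_2\,ds_3 = h_1 h_2 h_3\,(dq_1\,dq_2\,dq_3) \tag{C.8}$$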

These are evaluated below for polar, cylindrical, and spherical coordinates. 


C.2.1 Two-dimensional polar coordinates $(r, \theta)$ 


The complication and implications of time-dependent unit vectors are best illustrated by considering two- 
dimensional polar coordinates which is the simplest curvilinear coordinate system. Polar coordinates are a 
special case of cylindrical coordinates, when $z$ is held fixed, or a special case of spherical coordinate system, 
when $\phi$ is held fixed. 

Consider the motion of a point P as it moves along a curve $s(t)$ such that in the time interval $dt$ it moves 
from $P^{(1)}$ to $P^{(2)}$ as shown in figure C.2. The two-dimensional polar coordinates have unit vectors $\hat{\mathbf{r}}, \hat{\boldsymbol{\theta}}$, 
which are orthogonal and change from $\hat{\mathbf{r}}_1, \hat{\boldsymbol{\theta}}_1$ to $\hat{\mathbf{r}}_2, \hat{\boldsymbol{\theta}}_2$ in the time $dt$. Note that for these polar coordinates 
the angle unit vector $\hat{\boldsymbol{\theta}}$ is taken to be tangential to the rotation since this is the direction of motion of a 
point on the circumference at radius $r$. 

The net changes shown in the figure of table C.2 are 


$$d\hat{\mathbf{r}} = \hat{\mathbf{r}}_2 - \hat{\mathbf{r}}_1 = |\hat{\mathbf{r}}|\,d\theta\,\hat{\boldsymbol{\theta}} = d\theta\,\hat{\boldsymbol{\theta}} \tag{C.9}$$


since the unit vector $\hat{\mathbf{r}}$ has the constant magnitude $|\hat{\mathbf{r}}| = 1$. Note that the infinitesimal $d\hat{\mathbf{r}}$ is perpendicular to the unit 
vector $\hat{\mathbf{r}}$, that is, $d\hat{\mathbf{r}}$ points in the tangential direction $\hat{\boldsymbol{\theta}}$. 
Similarly, the infinitesimal 

$$d\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}_2 - \hat{\boldsymbol{\theta}}_1 = -d\theta\,\hat{\mathbf{r}} \tag{C.10}$$


which is perpendicular to the tangential $\hat{\boldsymbol{\theta}}$ unit vector and therefore points in the direction $-\hat{\mathbf{r}}$. The minus 
sign causes $-d\theta\,\hat{\mathbf{r}}$ to be directed in the opposite direction to $\hat{\mathbf{r}}$. 

The net distance element $d\mathbf{s}$ is given by 

$$d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\hat{\mathbf{r}} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} \tag{C.11}$$


This agrees with the prediction obtained using table C.1. 
The time derivatives of the unit vectors are given by equations (C.9) and (C.10) to be, 


dt dé . 
dô do . 


Note that the time derivatives of unit vectors are perpendicular to the corresponding unit vector, and the 
unit vectors are coupled. 
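Consider that the velocity $\mathbf{v}$ is expressed as

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{dr}{dt}\hat{r} + r\frac{d\hat{r}}{dt} = \dot{r}\, \hat{r} + r\dot{\theta}\, \hat{\theta} \tag{C.14}$$

The velocity is resolved into a radial component $\dot{r}$ and an angular, transverse, component $r\dot{\theta}$.
Similarly the acceleration is given by

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d\dot{r}}{dt}\hat{r} + \dot{r}\frac{d\hat{r}}{dt} + \frac{dr}{dt}\dot{\theta}\hat{\theta} + r\frac{d\dot{\theta}}{dt}\hat{\theta} + r\dot{\theta}\frac{d\hat{\theta}}{dt}$$

$$= \left(\ddot{r} - r\dot{\theta}^2\right)\hat{r} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\theta} \tag{C.15}$$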


where the $r\dot{\theta}^2$ term is the effective centripetal acceleration while the $2\dot{r}\dot{\theta}$ term is called the Coriolis term.
For the case when $\dot{r} = \ddot{r} = 0$, the first bracket in C.15 is the centripetal acceleration while the second
bracket is the tangential acceleration.
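
The two brackets of equation C.15 can be evaluated directly. The short Python sketch below is an illustration only; the function name `polar_acceleration` and the numerical values are assumptions chosen to isolate the centripetal and Coriolis terms.

```python
def polar_acceleration(r, r_dot, r_ddot, theta_dot, theta_ddot):
    """Return the (radial, transverse) acceleration components of equation C.15."""
    a_r = r_ddot - r * theta_dot ** 2
    a_theta = r * theta_ddot + 2 * r_dot * theta_dot
    return a_r, a_theta

# Uniform circular motion: r_dot = r_ddot = theta_ddot = 0, so only the
# centripetal term -r * theta_dot**2 survives.
a_r, a_theta = polar_acceleration(r=2.0, r_dot=0.0, r_ddot=0.0,
                                  theta_dot=3.0, theta_ddot=0.0)

# Constant radial speed on a uniformly rotating arm: the transverse
# component is purely the Coriolis term 2 * r_dot * theta_dot.
_, coriolis = polar_acceleration(r=2.0, r_dot=0.5, r_ddot=0.0,
                                 theta_dot=3.0, theta_ddot=0.0)

print(a_r, a_theta, coriolis)
```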

This discussion has shown that in contrast to the time independence of the cartesian unit basis vectors, 
the unit basis vectors for curvilinear coordinates are time dependent which leads to components of the velocity 
and acceleration involving coupled coordinates. 


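Coordinates          $r, \theta$
Area element         $da = r\, dr\, d\theta$
Unit vectors         $\hat{r} = \hat{i}\cos\theta + \hat{j}\sin\theta$
                     $\hat{\theta} = -\hat{i}\sin\theta + \hat{j}\cos\theta$
Time derivatives     $\frac{d\hat{r}}{dt} = \dot{\theta}\, \hat{\theta}$
of unit vectors      $\frac{d\hat{\theta}}{dt} = -\dot{\theta}\, \hat{r}$
Velocity             $\mathbf{v} = \dot{r}\, \hat{r} + r\dot{\theta}\, \hat{\theta}$
Kinetic energy       $\frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$
Acceleration         $\mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{r} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\theta}$
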


Table C.2: Differential relations plus a diagram of the unit vectors for 2-dimensional polar coordinates.


C.2.2 Cylindrical Coordinates $(\rho, \phi, z)$


The three-dimensional cylindrical coordinates $(\rho, \phi, z)$ are obtained by adding the motion along the symmetry
axis $\hat{z}$ to the case for polar coordinates. The unit basis vectors are shown in Table C.3 where the angular
unit vector $\hat{\phi}$ is taken to be tangential, corresponding to the direction a point on the circumference would
move. The distance and volume elements, the cartesian coordinate components of the cylindrical unit
basis vectors, and the unit vector time derivatives are shown in Table C.3. The time dependence of the
unit vectors is used to derive the acceleration. As for the two-dimensional polar coordinates, the $\hat{\rho}$ and $\hat{\phi}$
direction components of the acceleration for cylindrical coordinates are coupled functions of
$\rho$, $\dot{\rho}$, $\ddot{\rho}$, $\dot{\phi}$, and $\ddot{\phi}$.


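Coordinates          $\rho, \phi, z$
Distance element     $d\mathbf{s} = d\rho\, \hat{\rho} + \rho\, d\phi\, \hat{\phi} + dz\, \hat{z}$
Volume element       $dv = \rho\, d\rho\, d\phi\, dz$
Unit vectors         $\hat{\rho} = \hat{i}\cos\phi + \hat{j}\sin\phi$
                     $\hat{\phi} = -\hat{i}\sin\phi + \hat{j}\cos\phi$
                     $\hat{z} = \hat{k}$
Time derivatives     $\frac{d\hat{\rho}}{dt} = \dot{\phi}\, \hat{\phi}$
of unit vectors      $\frac{d\hat{\phi}}{dt} = -\dot{\phi}\, \hat{\rho}$
                     $\frac{d\hat{z}}{dt} = 0$
Velocity             $\mathbf{v} = \dot{\rho}\, \hat{\rho} + \rho\dot{\phi}\, \hat{\phi} + \dot{z}\, \hat{z}$
Kinetic energy       $\frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right)$
Acceleration         $\mathbf{a} = \left(\ddot{\rho} - \rho\dot{\phi}^2\right)\hat{\rho} + \left(\rho\ddot{\phi} + 2\dot{\rho}\dot{\phi}\right)\hat{\phi} + \ddot{z}\, \hat{z}$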


Table C.3: Differential relations plus a diagram of the unit vectors for cylindrical coordinates. 


C.2.3 Spherical Coordinates $(r, \theta, \phi)$


The three-dimensional spherical coordinates can be treated the same way as for cylindrical coordinates. The
unit basis vectors are shown in Table C.4 where the angular unit vectors $\hat{\theta}$ and $\hat{\phi}$ are taken to be tangential,
corresponding to the direction a point on the circumference moves for a positive rotation angle.


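Coordinates          $r, \theta, \phi$
Distance element     $d\mathbf{s} = dr\, \hat{r} + r\, d\theta\, \hat{\theta} + r\sin\theta\, d\phi\, \hat{\phi}$
Volume element       $dv = r^2\sin\theta\, dr\, d\theta\, d\phi$
Unit vectors         $\hat{r} = \hat{i}\sin\theta\cos\phi + \hat{j}\sin\theta\sin\phi + \hat{k}\cos\theta$
                     $\hat{\theta} = \hat{i}\cos\theta\cos\phi + \hat{j}\cos\theta\sin\phi - \hat{k}\sin\theta$
                     $\hat{\phi} = -\hat{i}\sin\phi + \hat{j}\cos\phi$
Time derivatives     $\frac{d\hat{r}}{dt} = \dot{\theta}\, \hat{\theta} + \dot{\phi}\sin\theta\, \hat{\phi}$
of unit vectors      $\frac{d\hat{\theta}}{dt} = -\dot{\theta}\, \hat{r} + \dot{\phi}\cos\theta\, \hat{\phi}$
                     $\frac{d\hat{\phi}}{dt} = -\dot{\phi}\sin\theta\, \hat{r} - \dot{\phi}\cos\theta\, \hat{\theta}$
Velocity             $\mathbf{v} = \dot{r}\, \hat{r} + r\dot{\theta}\, \hat{\theta} + r\dot{\phi}\sin\theta\, \hat{\phi}$
Kinetic energy       $\frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\, \dot{\phi}^2\right)$
Acceleration         $\mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2\sin^2\theta\right)\hat{r}$
                     $\quad + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta} - r\dot{\phi}^2\sin\theta\cos\theta\right)\hat{\theta}$
                     $\quad + \left(r\ddot{\phi}\sin\theta + 2\dot{r}\dot{\phi}\sin\theta + 2r\dot{\theta}\dot{\phi}\cos\theta\right)\hat{\phi}$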


Table C.4: Differential relations plus a diagram of the unit vectors for spherical coordinates.

The distance and volume elements, the cartesian coordinate components of the spherical unit basis
vectors, and the unit vector time derivatives are shown in Table C.4. The time dependence
of the unit vectors is used to derive the acceleration. As for the case of cylindrical coordinates, the $\hat{r}$, $\hat{\theta}$, and
$\hat{\phi}$ components of the acceleration involve coupling of the coordinates and their time derivatives.

It is important to note that the angular unit vectors $\hat{\theta}$ and $\hat{\phi}$ are taken to be tangential to the circles of
rotation. However, for discussion of angular velocity or angular momentum it is more convenient to use the
axes of rotation defined by $\hat{r} \times \hat{\theta}$ and $\hat{r} \times \hat{\phi}$ for specifying the vector properties, since these axes are
perpendicular to the unit vectors $\hat{\theta}$ and $\hat{\phi}$ respectively. Be careful not to confuse the unit vectors $\hat{\theta}$ and $\hat{\phi}$
with the angular velocities $\dot{\theta}$ and $\dot{\phi}$.
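
The cartesian components of the spherical unit vectors can be checked numerically. The Python sketch below is a minimal illustration, with hypothetical helper names and an arbitrary choice of angles; it confirms that the triad is orthonormal and evaluates the two rotation axes $\hat{r} \times \hat{\theta}$ and $\hat{r} \times \hat{\phi}$ mentioned above.

```python
import math

def spherical_unit_vectors(theta, phi):
    """Cartesian components of (r_hat, theta_hat, phi_hat) at angles theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    r_hat = (st * cp, st * sp, ct)
    theta_hat = (ct * cp, ct * sp, -st)
    phi_hat = (-sp, cp, 0.0)
    return r_hat, theta_hat, phi_hat

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Arbitrary illustrative angles (radians).
r_hat, theta_hat, phi_hat = spherical_unit_vectors(0.7, 2.1)

axis_for_theta = cross(r_hat, theta_hat)   # equals +phi_hat
axis_for_phi = cross(r_hat, phi_hat)       # equals -theta_hat

print(dot(r_hat, theta_hat), dot(r_hat, phi_hat), dot(theta_hat, phi_hat))
print(axis_for_theta, phi_hat)
print(axis_for_phi, theta_hat)
```

The first cross product reproduces $\hat{\phi}$, so the ordered triad $(\hat{r}, \hat{\theta}, \hat{\phi})$ is right-handed, while $\hat{r} \times \hat{\phi} = -\hat{\theta}$.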


C.3 Frenet-Serret coordinates 


The cartesian, polar, cylindrical, and spherical coordinate systems are all orthogonal coordinate
systems that are fixed in space. There are situations where it is more convenient to use the Frenet-Serret
coordinates, which comprise an orthogonal coordinate system that is fixed to the particle that is moving
along a continuous, differentiable trajectory in three-dimensional Euclidean space. Let $s(t)$ represent a
monotonically increasing arc length along the trajectory of the particle motion as a function of time $t$. The
Frenet-Serret coordinates, shown in figure C.5, are the three instantaneous orthogonal unit vectors $\hat{t}$, $\hat{n}$, and
$\hat{b}$, where the tangent unit vector $\hat{t}$ is the instantaneous tangent to the curve, the normal unit vector $\hat{n}$ is in
the plane of curvature of the trajectory pointing towards the center of the instantaneous radius of curvature
and is perpendicular to the tangent unit vector $\hat{t}$, while the binormal unit vector $\hat{b} = \hat{t} \times \hat{n}$ is
perpendicular to the plane of curvature and is mutually perpendicular to the other two Frenet-Serret unit
vectors. The Frenet-Serret unit vectors are defined by the relations


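$$\frac{d\hat{t}}{ds} = \kappa\, \hat{n} \tag{C.16}$$

$$\frac{d\hat{b}}{ds} = -\tau\, \hat{n} \tag{C.17}$$

$$\frac{d\hat{n}}{ds} = -\kappa\, \hat{t} + \tau\, \hat{b} \tag{C.18}$$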


The curvature $\kappa = \frac{1}{\rho}$ where $\rho$ is the radius of curvature, and $\tau$ is the torsion that can be either positive
or negative. For increasing $s$, a non-zero curvature $\kappa$ implies that the triad of unit vectors rotates in a
right-handed sense about $\hat{b}$. If the torsion $\tau$ is positive (negative) the triad of unit vectors rotates in a right
(left) handed sense about $\hat{t}$.


Distance element 


Unit vectors 


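Time derivatives     $\frac{d}{dt}\begin{pmatrix} \hat{t} \\ \hat{n} \\ \hat{b} \end{pmatrix} = |\mathbf{v}| \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix} \begin{pmatrix} \hat{t} \\ \hat{n} \\ \hat{b} \end{pmatrix}$
of unit vectors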


Velocity 
Acceleration 


Table C.5. The differential relations plus a diagram of the corresponding unit vectors for the Frenet-Serret 
coordinate system. 




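The above equations also can be rewritten using a new rotation vector $\boldsymbol{\omega}$ where

$$\boldsymbol{\omega} = \tau\, \hat{t} + \kappa\, \hat{b} \tag{C.19}$$

Then equations C.16 to C.18 are transformed to

$$\frac{d\hat{t}}{ds} = \boldsymbol{\omega} \times \hat{t} \tag{C.20}$$

$$\frac{d\hat{n}}{ds} = \boldsymbol{\omega} \times \hat{n} \tag{C.21}$$

$$\frac{d\hat{b}}{ds} = \boldsymbol{\omega} \times \hat{b} \tag{C.22}$$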


In general the Frenet-Serret unit vectors are time dependent. If the curvature $\kappa = 0$ then the curve is a
straight line and $\hat{n}$ and $\hat{b}$ are not well defined. If the torsion is zero then the trajectory lies in a plane. Note
that a helix has constant curvature and constant torsion.
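
The statement about the helix can be illustrated numerically. The Python sketch below assumes the parametrization $\mathbf{r}(t) = (a\cos t, a\sin t, bt)$ and uses the standard differential-geometry formulas $\kappa = |\mathbf{r}' \times \mathbf{r}''| / |\mathbf{r}'|^3$ and $\tau = (\mathbf{r}' \times \mathbf{r}'') \cdot \mathbf{r}''' / |\mathbf{r}' \times \mathbf{r}''|^2$. Neither the parametrization nor these formulas appear in the text, and the function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def helix_curvature_torsion(a, b, t):
    """Curvature and torsion of r(t) = (a cos t, a sin t, b t) at parameter t."""
    r1 = (-a * math.sin(t), a * math.cos(t), b)      # first derivative
    r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)   # second derivative
    r3 = (a * math.sin(t), -a * math.cos(t), 0.0)    # third derivative
    c = cross(r1, r2)
    kappa = math.sqrt(dot(c, c)) / dot(r1, r1) ** 1.5
    tau = dot(c, r3) / dot(c, c)
    return kappa, tau

# Evaluate at several points along a helix with a = 2, b = 1.
values = [helix_curvature_torsion(2.0, 1.0, t) for t in (0.0, 0.7, 2.5)]
for kappa, tau in values:
    print(kappa, tau)
```

For this helix the values are independent of $t$ and equal $\kappa = a/(a^2 + b^2)$ and $\tau = b/(a^2 + b^2)$.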

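The rate of change of a general vector field $\mathbf{E}$ along the trajectory can be written as

$$\frac{d\mathbf{E}}{ds} = \left(\frac{dE_t}{ds}\hat{t} + \frac{dE_n}{ds}\hat{n} + \frac{dE_b}{ds}\hat{b}\right) + \boldsymbol{\omega} \times \mathbf{E} \tag{C.23}$$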


The Frenet-Serret coordinates are used in the life sciences to describe the motion of a moving organism 
in a viscous medium. The Frenet-Serret coordinates also have applications to General Relativity. 


Workshop exercises 


1. The goal of this problem is to help you understand the origin of the equations that relate two different coordinate 
systems. Refer to diagrams for cylindrical and spherical coordinates as your teaching assistant explains how to 
arrive at expressions for $x_1$, $x_2$, and $x_3$ in terms of $\rho$, $\phi$, and $z$ and how to derive expressions for the velocity and
acceleration vectors in cylindrical coordinates. Now try to relate spherical and rectangular coordinate systems. 
Your group should derive expressions relating the coordinates of the two systems, expressions relating the unit 
vectors and their time derivatives of the two systems, and finally, expressions for the velocity and acceleration 
in spherical coordinates. 


Appendix D 


Coordinate transformations 


Coordinate systems can be translated, or rotated with respect to each other as well as being subject to spatial 
inversion or time reversal. Scalars, vectors, and tensors are defined by their transformation properties under 
rotation, spatial inversion and time reversal, and thus such transformations play a pivotal role in physics. 


D.1 Translational transformations 


Translational transformations are involved frequently for transforming between the center of mass and
laboratory frames for reaction kinematics as well as when performing vector addition of central forces for the
cases where the centers are displaced. Both the classical Galilean transformation and the relativistic Lorentz
transformation are handled the same way. Consider two parallel orthonormal coordinate frames where the
origin of $F'(x', y', z')$ is displaced by a time dependent vector $\mathbf{a}(t)$ from the origin of frame $F(x, y, z)$. Then
the Galilean transformation for a vector $\mathbf{r}$ in frame $F$ to $\mathbf{r}'$ in frame $F'$ is given by

$$\mathbf{r}'(x', y', z') = \mathbf{r}(x, y, z) + \mathbf{a}(t) \tag{D.1}$$


The velocities for a moving frame are given by the vector difference of the velocity in a stationary frame, 
and the velocity of the origin of the moving frame. Linear accelerations can be handled similarly. 


D.2 Rotational transformations 


D.2.1 Rotation matrix 


Rotational transformations of the coordinate system are used extensively in physics. The transformation 
properties of fields under rotation define the scalar and vector properties of fields, as well as rotational 
symmetry and conservation of angular momentum. 

Rotation of the coordinate frame does not change the value of any scalar observable such as mass,
temperature etc. That is, a scalar quantity is invariant under coordinate rotation from
$x, y, z \rightarrow x', y', z'$.

$$\phi(x'y'z') = \phi(xyz) \tag{D.2}$$


By contrast, the components of a vector along the coordinate axes change under rotation of the coordinate
axes. This difference in transformation properties under rotation between a scalar and a vector is important
and defines both scalars and vectors.

Matrix mechanics, described in appendix A, provides the most convenient way to handle coordinate
rotations. The transformation matrix between coordinate systems having differing orientations is called the
rotation matrix. This transforms the components of any vector with respect to one coordinate frame to
the components with respect to a second coordinate frame rotated with respect to the first frame.

Assume a point $P$ has coordinates $(x_1, x_2, x_3)$ with respect to a certain coordinate system. Consider
rotation to another coordinate frame for which the point $P$ has coordinates $(x'_1, x'_2, x'_3)$ and assume that the
origins of both frames coincide. Rotation of a frame does not change the vector, only its components
with respect to the unit basis vectors. Therefore

$$\mathbf{x} = \hat{e}'_1 x'_1 + \hat{e}'_2 x'_2 + \hat{e}'_3 x'_3 = \hat{e}_1 x_1 + \hat{e}_2 x_2 + \hat{e}_3 x_3 \tag{D.3}$$


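Note that if one designates that the unit vectors for the unprimed coordinate frame are $(\hat{e}_1, \hat{e}_2, \hat{e}_3)$ and for
the primed coordinate frame $(\hat{e}'_1, \hat{e}'_2, \hat{e}'_3)$, then taking the scalar product of equation D.3 sequentially with
each of the unit base vectors $(\hat{e}'_1, \hat{e}'_2, \hat{e}'_3)$ leads to the following three relations

$$x'_1 = (\hat{e}'_1 \cdot \hat{e}_1)x_1 + (\hat{e}'_1 \cdot \hat{e}_2)x_2 + (\hat{e}'_1 \cdot \hat{e}_3)x_3 \tag{D.4}$$
$$x'_2 = (\hat{e}'_2 \cdot \hat{e}_1)x_1 + (\hat{e}'_2 \cdot \hat{e}_2)x_2 + (\hat{e}'_2 \cdot \hat{e}_3)x_3$$
$$x'_3 = (\hat{e}'_3 \cdot \hat{e}_1)x_1 + (\hat{e}'_3 \cdot \hat{e}_2)x_2 + (\hat{e}'_3 \cdot \hat{e}_3)x_3$$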


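Note that the $(\hat{e}'_i \cdot \hat{e}_j)$ are the direction cosines as defined by the scalar product of two unit vectors for axes
$i, j$, that is, they are the cosine of the angle between the two unit vectors.
Equation D.4 can be written in matrix form as

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \tag{D.5}$$

where the "$\cdot$" means the inner matrix product of the rotation matrix $\boldsymbol{\lambda}$ and the vector $\mathbf{x}$ where

$$\mathbf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad \boldsymbol{\lambda} = \begin{pmatrix} \hat{e}'_1 \cdot \hat{e}_1 & \hat{e}'_1 \cdot \hat{e}_2 & \hat{e}'_1 \cdot \hat{e}_3 \\ \hat{e}'_2 \cdot \hat{e}_1 & \hat{e}'_2 \cdot \hat{e}_2 & \hat{e}'_2 \cdot \hat{e}_3 \\ \hat{e}'_3 \cdot \hat{e}_1 & \hat{e}'_3 \cdot \hat{e}_2 & \hat{e}'_3 \cdot \hat{e}_3 \end{pmatrix} \tag{D.6}$$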


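The inverse procedure is obtained by taking the scalar product of equation D.3 successively with each of the unit basis
vectors $(\hat{e}_1, \hat{e}_2, \hat{e}_3)$, leading to three equations

$$x_1 = (\hat{e}_1 \cdot \hat{e}'_1)x'_1 + (\hat{e}_1 \cdot \hat{e}'_2)x'_2 + (\hat{e}_1 \cdot \hat{e}'_3)x'_3 \tag{D.7}$$
$$x_2 = (\hat{e}_2 \cdot \hat{e}'_1)x'_1 + (\hat{e}_2 \cdot \hat{e}'_2)x'_2 + (\hat{e}_2 \cdot \hat{e}'_3)x'_3$$
$$x_3 = (\hat{e}_3 \cdot \hat{e}'_1)x'_1 + (\hat{e}_3 \cdot \hat{e}'_2)x'_2 + (\hat{e}_3 \cdot \hat{e}'_3)x'_3$$

Equation D.7 can be written in matrix form as

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot \mathbf{x}' \tag{D.8}$$

where $\boldsymbol{\lambda}^T$ is the transpose of $\boldsymbol{\lambda}$.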
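Note that substituting equation D.5 into equation D.8 gives

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot (\boldsymbol{\lambda} \cdot \mathbf{x}) = \left(\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda}\right) \cdot \mathbf{x} \tag{D.9}$$

Thus

$$\left(\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda}\right) = \mathbf{I}$$

where $\mathbf{I}$ is the identity matrix. This implies that the rotation matrix $\boldsymbol{\lambda}$ is orthogonal with $\boldsymbol{\lambda}^T = \boldsymbol{\lambda}^{-1}$.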
It is convenient to rename the elements of the rotation matrix to be

$$\lambda_{ij} \equiv (\hat{e}'_i \cdot \hat{e}_j) \tag{D.10}$$

so that the rotation matrix is written more compactly as

$$\boldsymbol{\lambda} = \begin{pmatrix} \lambda_{11} & \lambda_{12} & \lambda_{13} \\ \lambda_{21} & \lambda_{22} & \lambda_{23} \\ \lambda_{31} & \lambda_{32} & \lambda_{33} \end{pmatrix}$$

and equation D.4 becomes

$$x'_1 = \lambda_{11}x_1 + \lambda_{12}x_2 + \lambda_{13}x_3 \tag{D.11}$$
$$x'_2 = \lambda_{21}x_1 + \lambda_{22}x_2 + \lambda_{23}x_3$$
$$x'_3 = \lambda_{31}x_1 + \lambda_{32}x_2 + \lambda_{33}x_3$$

Consider an arbitrary rotation through an angle $\theta$. Equations (B.10) and (B.11) can be used to relate
six of the nine quantities $\lambda_{ij}$ in the rotation matrix, so only three of the quantities are independent. That
is, because of equation (B.11) we have three equations which ensure that the transformation is unitary.


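$$\lambda_{i1}^2 + \lambda_{i2}^2 + \lambda_{i3}^2 = 1 \tag{D.12}$$

Also requiring that the axes be orthogonal gives three equations

$$\sum_j \lambda_{ij}\lambda_{kj} = 0, \qquad i \neq k \tag{D.13}$$

These six relations can be expressed as

$$\sum_j \lambda_{ij}\lambda_{kj} = \delta_{ik} \tag{D.14}$$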


The fact that the rotation matrix should have three independent quantities is due to the fact that all rotations 
can be expressed in terms of rotations about three orthogonal axes. 
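
The orthogonality relations (D.14) can be tested numerically for a rotation matrix built from three successive rotations about coordinate axes. The Python sketch below is illustrative only: the helper names, the choice of axes, and the three angles are assumptions, and the elementary matrices use the same passive convention as the examples in this appendix.

```python
import math

def rot_x1(angle):
    """Rotation of the coordinate frame by `angle` about the x1 axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot_x3(angle):
    """Rotation of the coordinate frame by `angle` about the x3 axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def orthogonality_residual(lam):
    """Largest |sum_j lam_ij lam_kj - delta_ik| over all i, k."""
    worst = 0.0
    for i in range(3):
        for k in range(3):
            total = sum(lam[i][j] * lam[k][j] for j in range(3))
            worst = max(worst, abs(total - (1.0 if i == k else 0.0)))
    return worst

# Three arbitrary angles specify the rotation: three independent quantities.
lam = matmul(rot_x3(0.4), matmul(rot_x1(1.1), rot_x3(-0.9)))
residual = orthogonality_residual(lam)

# A shear is not a rotation and fails the test.
shear = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shear_residual = orthogonality_residual(shear)

print(residual, shear_residual)
```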


D.1 Example: Rotation matrix


Consider a point $P(x_1, x_2, x_3) = P(3, 4, 5)$ in the unprimed coordinate system. Consider the same point
$P(x'_1, x'_2, x'_3)$ in the primed coordinate system which has been rotated by an angle 60° about the $x_1$ axis as
shown. The direction cosines $\lambda_{ij} = \cos(\theta_{ij})$ can be determined from the figure to be the following

$i$    $j$    $\lambda_{ij} = \cos(\theta_{ij})$
1      1      1
1      2      0
1      3      0
2      1      0
2      2      0.500
2      3      0.866
3      1      0
3      2      −0.866
3      3      0.500


Thus the rotation matrix is

$$\boldsymbol{\lambda} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix}$$

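The transformed point $P'(x'_1, x'_2, x'_3)$ therefore is given by

$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix} = \begin{pmatrix} 3 \\ 6.330 \\ -0.964 \end{pmatrix}$$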
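
The arithmetic of this example can be reproduced with a few lines of Python. This is a minimal sketch in which the function names are hypothetical; it applies a 60° frame rotation about the $x_1$ axis to the point (3, 4, 5) and compares the length of the vector before and after the rotation.

```python
import math

def rotation_about_x1(angle_deg):
    """Rotation matrix for a frame rotated by angle_deg about the x1 axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def apply(lam, x):
    """Matrix-vector product x' = lam . x."""
    return [sum(lam[i][j] * x[j] for j in range(3)) for i in range(3)]

lam = rotation_about_x1(60.0)
x = [3.0, 4.0, 5.0]
x_prime = apply(lam, x)

norm = math.sqrt(sum(c * c for c in x))
norm_prime = math.sqrt(sum(c * c for c in x_prime))

print([round(c, 3) for c in x_prime])
print(norm, norm_prime)
```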


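Note that the radial coordinate $r_P = r'_P = \sqrt{50}$. That is, the rotational transformation is unitary and thus
the magnitude of the vector is unchanged.


D.2 Example: Proof that a rotation matrix is orthogonal


Consider the rotation matrix

$$\boldsymbol{\lambda} = \frac{1}{9}\begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix}$$

The product

$$\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda} = \frac{1}{81}\begin{pmatrix} 4 & 1 & 8 \\ 7 & 4 & -4 \\ -4 & 8 & 1 \end{pmatrix} \cdot \begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix} = \frac{1}{81}\begin{pmatrix} 81 & 0 & 0 \\ 0 & 81 & 0 \\ 0 & 0 & 81 \end{pmatrix} = \mathbf{I}$$

which implies that $\boldsymbol{\lambda}$ is orthogonal.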


D.2.2 Finite rotations 


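Consider two finite 90° rotations $\boldsymbol{\lambda}_A$ and $\boldsymbol{\lambda}_B$ illustrated in figure D.1. The $\boldsymbol{\lambda}_A$ rotation is 90° around
the $x_3$ axis in a right-handed direction as shown. In such a rotation the axes transform to $x'_1 = x_2$,
$x'_2 = -x_1$, $x'_3 = x_3$ and the rotation matrix is

$$\boldsymbol{\lambda}_A = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{D.15}$$

The second rotation $\boldsymbol{\lambda}_B$ is a right-handed rotation about the $x'_1$ axis which formerly was the $x_2$ axis.
Then $x''_1 = x'_1$, $x''_2 = x'_3$, $x''_3 = -x'_2$ and the rotation matrix is

$$\boldsymbol{\lambda}_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \tag{D.16}$$

Figure D.1: Order of two finite rotations for a parallelepiped.

Consider the product of these two finite rotations which corresponds to a single rotation matrix $\boldsymbol{\lambda}_{AB}$

$$\boldsymbol{\lambda}_{AB} = \boldsymbol{\lambda}_B \boldsymbol{\lambda}_A \tag{D.17}$$

That is:

$$\boldsymbol{\lambda}_{AB} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \tag{D.18}$$

Now consider that the order of these two rotations is reversed.

$$\boldsymbol{\lambda}_{BA} = \boldsymbol{\lambda}_A \boldsymbol{\lambda}_B \tag{D.19}$$

That is:

$$\boldsymbol{\lambda}_{BA} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} \neq \boldsymbol{\lambda}_{AB} \tag{D.20}$$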

An entirely different orientation results as illustrated in figure D.1. 
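
The two orderings can be compared by direct matrix multiplication. The Python sketch below assumes the passive convention used in this appendix, with `lam_A` a 90° rotation about the $x_3$ axis and `lam_B` a 90° rotation about the $x_1$ axis; the variable and function names are illustrative.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# 90 degree frame rotation about the x3 axis.
lam_A = [[0, 1, 0],
         [-1, 0, 0],
         [0, 0, 1]]

# 90 degree frame rotation about the x1 axis.
lam_B = [[1, 0, 0],
         [0, 0, 1],
         [0, -1, 0]]

lam_AB = matmul(lam_B, lam_A)   # rotation A first, then B
lam_BA = matmul(lam_A, lam_B)   # rotation B first, then A

print(lam_AB)
print(lam_BA)
print(lam_AB == lam_BA)
```

The two products differ, so the final orientation depends on the order in which the finite rotations are performed.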

This behavior of finite rotations is a consequence of the fact that finite rotations do not commute, that
is, reversing the order does not give the same answer. Thus, if we associate the vectors $\mathbf{A}$ and $\mathbf{B}$ with
these rotations, then it implies that the product $\mathbf{AB} \neq \mathbf{BA}$. That is, finite rotations do not combine
like true vectors, whose addition is commutative.


D.2.3 Infinitesimal rotations


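Infinitesimal rotations do not suffer from the noncommutation defect of finite rotations. If the position
vector of a point changes from $\mathbf{r}$ to $\mathbf{r} + \delta\mathbf{r}$ then the geometrical situation is represented correctly by

$$\delta\mathbf{r} = \delta\boldsymbol{\theta} \times \mathbf{r} \tag{D.21}$$

where $\delta\boldsymbol{\theta}$ is a quantity whose magnitude is equal to the infinitesimal rotation angle and which has a
direction along the instantaneous axis of rotation as illustrated in figure D.2.

Figure D.2: Infinitesimal rotation

That the infinitesimal angle $\delta\boldsymbol{\theta}$ is a vector is shown by proving that two infinitesimal rotations
$\delta\boldsymbol{\theta}_1$ and $\delta\boldsymbol{\theta}_2$ commute. The changes in the position vector of the point are

$$\delta\mathbf{r}_1 = \delta\boldsymbol{\theta}_1 \times \mathbf{r} \tag{D.22}$$

and

$$\delta\mathbf{r}_2 = \delta\boldsymbol{\theta}_2 \times (\mathbf{r} + \delta\mathbf{r}_1) \tag{D.23}$$

Thus the final position vector for $\delta\boldsymbol{\theta}_1$ followed by $\delta\boldsymbol{\theta}_2$ is

$$\mathbf{r} + \delta\mathbf{r}_1 + \delta\mathbf{r}_2 = \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} + \delta\boldsymbol{\theta}_2 \times (\mathbf{r} + \delta\mathbf{r}_1) \tag{D.24}$$

Assuming that the second-order infinitesimals can be ignored gives

$$\mathbf{r} + \delta\mathbf{r}_1 + \delta\mathbf{r}_2 = \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} \tag{D.25}$$

Consider now the inverse order of rotations.

$$\mathbf{r} + \delta\mathbf{r}_2 + \delta\mathbf{r}_1 = \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} + \delta\boldsymbol{\theta}_1 \times (\mathbf{r} + \delta\mathbf{r}_2) \tag{D.26}$$

Again, neglecting the second-order infinitesimals gives

$$\mathbf{r} + \delta\mathbf{r}_2 + \delta\mathbf{r}_1 = \mathbf{r} + \delta\boldsymbol{\theta}_2 \times \mathbf{r} + \delta\boldsymbol{\theta}_1 \times \mathbf{r} \tag{D.27}$$

Note that the results of these two orderings of infinitesimal rotations, D.25 and D.27, are identical. That is,
assuming that second-order infinitesimals can be neglected, the infinitesimal rotations commute, and thus
$\delta\boldsymbol{\theta}_1$ and $\delta\boldsymbol{\theta}_2$ are correctly represented by vectors.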
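
The size of the neglected second-order term can be illustrated numerically. The Python sketch below applies the first-order rule $\mathbf{r} \rightarrow \mathbf{r} + \delta\boldsymbol{\theta} \times \mathbf{r}$ twice, in both orders, and measures the mismatch between the two final positions. The test vector, the choice of rotation axes, and the helper names are all assumptions made for illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_small(r, dtheta):
    """First-order rotation: r -> r + dtheta x r."""
    return tuple(p + q for p, q in zip(r, cross(dtheta, r)))

def order_mismatch(eps):
    """Distance between the final positions for the two orders of rotation."""
    r = (1.0, 2.0, 3.0)
    d1 = (eps, 0.0, 0.0)   # small rotation about the x1 axis
    d2 = (0.0, eps, 0.0)   # small rotation about the x2 axis
    r12 = rotate_small(rotate_small(r, d1), d2)
    r21 = rotate_small(rotate_small(r, d2), d1)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(r12, r21)))

coarse = order_mismatch(1e-2)
fine = order_mismatch(1e-3)
ratio = coarse / fine

print(coarse, fine, ratio)
```

Reducing the rotation angle by a factor of 10 reduces the mismatch by a factor of 100, which shows that the non-commuting part is second order in the angles and vanishes in the infinitesimal limit.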

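The fact that $\delta\boldsymbol{\theta}$ is a vector allows angular velocity to be represented by a vector. That is, angular
velocity is the ratio of an infinitesimal rotation to an infinitesimal time.

$$\boldsymbol{\omega} = \frac{\delta\boldsymbol{\theta}}{\delta t} \tag{D.28}$$

Note that this implies that the velocity of the point can be expressed as

$$\mathbf{v} = \frac{\delta\mathbf{r}}{\delta t} = \boldsymbol{\omega} \times \mathbf{r} \tag{D.29}$$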


D.2.4 Proper and improper rotations 


The requirement that the coordinate axes be orthogonal, and that the transformation be unitary, leads to
the relation between the components of the rotation matrix.

$$\sum_j \lambda_{ij}\lambda_{kj} = \delta_{ik} \tag{D.30}$$

It was shown in equation A.12 that, for such an orthogonal matrix, the inverse matrix $\boldsymbol{\lambda}^{-1}$ equals the
transposed matrix $\boldsymbol{\lambda}^T$

$$\boldsymbol{\lambda}^{-1} = \boldsymbol{\lambda}^T$$

Inserting the orthogonality relation for the rotation matrix leads to the fact that the square of the determinant
of the rotation matrix equals one,

$$|\boldsymbol{\lambda}|^2 = 1 \tag{D.31}$$

that is

$$|\boldsymbol{\lambda}| = \pm 1 \tag{D.32}$$

A proper rotation is the rotation of a normal vector and has

$$|\boldsymbol{\lambda}| = +1 \tag{D.33}$$

An improper rotation corresponds to

$$|\boldsymbol{\lambda}| = -1 \tag{D.34}$$

An improper rotation implies a rotation plus a spatial reflection which cannot be achieved by any combination
of only rotations.

Consider the cross product of two vectors $\mathbf{c} = \mathbf{a} \times \mathbf{b}$. It can be shown that the cross product behaves
under rotation as:

$$c'_i = |\boldsymbol{\lambda}| \sum_j \lambda_{ij} c_j \tag{D.35}$$

For all proper rotations the determinant $|\boldsymbol{\lambda}| = +1$ and thus the cross product also acts like a proper vector
under rotation. This is not true for improper rotations where $|\boldsymbol{\lambda}| = -1$.


D.3 Spatial inversion transformation 


Spatial inversion, that is, mirror reflection, corresponds to reflection of all coordinate vectors,
$\hat{i} \rightarrow -\hat{i}$, $\hat{j} \rightarrow -\hat{j}$, and $\hat{k} \rightarrow -\hat{k}$. Such a transformation corresponds to the transformation matrix

$$\boldsymbol{\lambda} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} = -\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{D.36}$$

Thus $|\boldsymbol{\lambda}| = -1$, that is, it corresponds to an improper rotation. A spatial inversion for two vectors
$\mathbf{A}(\mathbf{r})$ and $\mathbf{B}(\mathbf{r})$ corresponds to

$$\mathbf{A}(\mathbf{r}) = -\mathbf{A}(-\mathbf{r}) \tag{D.37}$$
$$\mathbf{B}(\mathbf{r}) = -\mathbf{B}(-\mathbf{r})$$

That is, normal polar vectors change sign under spatial reflection. However, the cross product
$\mathbf{C} = \mathbf{A} \times \mathbf{B}$ does not change sign under spatial inversion since the product of the two minus signs is
positive. That is,

$$\mathbf{C}(\mathbf{r}) = +\mathbf{C}(-\mathbf{r}) \tag{D.38}$$

Figure D.3: Inversion of an object corresponds to reflection about the origin of all axes.

Thus the cross product behaves differently from a polar vector. This improper behavior is characteristic of an
axial vector, which also is called a pseudovector.
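
The different behavior of polar and axial vectors under inversion is easy to verify with numbers. In the Python sketch below the two polar vectors are arbitrary illustrative values and the helper names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def invert(v):
    """Spatial inversion of a polar vector: every component changes sign."""
    return tuple(-c for c in v)

# Two arbitrary polar vectors.
A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)

C = cross(A, B)
C_after_inversion = cross(invert(A), invert(B))

print(C)
print(C_after_inversion)
```

Both polar vectors change sign, but their cross product is unchanged because the two minus signs cancel.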
Examples of pseudovectors are angular momentum, spin, magnetic field etc. These pseudovectors are
defined using the right-hand rule and thus have handedness. For a right-handed system

$$\mathbf{C}_R = \mathbf{A} \times \mathbf{B} \tag{D.39}$$

Changing to a left-handed system leads to

$$\mathbf{C}_L = \mathbf{B} \times \mathbf{A} = -\mathbf{A} \times \mathbf{B} \tag{D.40}$$

That is, handedness corresponds to a definite ordering of the cross product. Proper orthogonal
transformations are said to preserve chirality (Greek for handedness) of a coordinate system.
An example of the use of the right-handed system is the usual definition of cartesian unit vectors,

$$\hat{i} \times \hat{j} = \hat{k} \tag{D.41}$$


An obvious question to be asked is whether the handedness of a coordinate system is merely a mathematical
curiosity or whether it has some deep underlying significance. Consider the Lorentz force

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{D.42}$$


Since force and velocity are proper vectors then the magnetic $\mathbf{B}$ field must be a pseudovector. Note that
calculation of the $\mathbf{B}$ field occurs only in cross products such as,

$$\nabla \times \mathbf{B} = \mu_0\, \mathbf{j} \tag{D.43}$$

where the current density $\mathbf{j}$ is a proper vector. Another example is the Biot-Savart Law which expresses $\mathbf{B}$
as

$$d\mathbf{B} = \frac{\mu_0 I}{4\pi} \frac{d\mathbf{l} \times \hat{r}}{r^2} \tag{D.44}$$

Thus even though $\mathbf{B}$ is a pseudovector, the force $\mathbf{F}$ remains a proper vector. Thus if a left-handed coordinate
definition $d\mathbf{B}_L = \frac{\mu_0 I}{4\pi} \frac{\hat{r} \times d\mathbf{l}}{r^2}$ is used in D.44, and $\mathbf{F} = q(\mathbf{E} + \mathbf{B}_L \times \mathbf{v})$ in D.42, then the same final physical
result would be obtained.

It was long thought that the laws of physics were symmetric with respect to spatial inversion (i.e. mirror
reflection), meaning that the choice between left-handed and right-handed representations (chirality) was
arbitrary. This is true for the gravitational, electromagnetic, and strong forces, and is called the conservation
of parity. The fourth fundamental force in nature, the weak force, violates parity and favours handedness.
It turns out that right-handed ordinary matter is symmetrical with left-handed antimatter.

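In addition to the two flavours of vectors, one has scalars and pseudoscalars defined by:

$$\phi(\mathbf{r}) = +\phi(-\mathbf{r}) \tag{D.45}$$
$$\phi(\mathbf{r}) = -\phi(-\mathbf{r}) \tag{D.46}$$

An example of a pseudoscalar is the scalar triple product $\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$.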
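
As a quick numerical illustration, the Python sketch below evaluates the scalar triple product of three polar vectors before and after inverting all of them. The vectors are arbitrary values chosen for the example and the function names are hypothetical.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))

def invert(v):
    """Spatial inversion of a polar vector."""
    return tuple(-x for x in v)

A = (1.0, 2.0, 3.0)
B = (-2.0, 0.5, 4.0)
C = (0.0, 1.0, -1.0)

before = triple(A, B, C)
after = triple(invert(A), invert(B), invert(C))
scalar_before = dot(A, B)
scalar_after = dot(invert(A), invert(B))

print(before, after)
print(scalar_before, scalar_after)
```

The triple product changes sign under inversion, as expected for a pseudoscalar, whereas the ordinary scalar product of two polar vectors is unchanged.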


D.4 Time reversal transformation 


The basic laws of classical mechanics are invariant to the sense of the direction of time. Under time reversal
the vector $\mathbf{r}$ is unchanged while both momentum $\mathbf{p}$ and time $t$ change sign, thus the time
derivative $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is invariant to time reversal; that is, the force is unchanged and Newton's law
$\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is invariant under time reversal. Since the force can be expressed as the gradient of a scalar
potential for a conservative field, the potential also remains unchanged. That is

$$\frac{d\mathbf{p}}{dt} = -\nabla U(\mathbf{r}) = \mathbf{F} \tag{D.47}$$

It is necessary to introduce tensor algebra, given in appendix E, prior to discussion of the transformation
properties of observables which is the topic of appendix E.5.


Workshop exercises 


1. Suppose the $x_2$-axis of a rectangular coordinate system is rotated by 30° away from the $x_3$-axis around the
$x_1$-axis.


(a) Find the corresponding transformation matrix. Try to do this by drawing a diagram instead of going to
the book or the notes for a formula.

(b) Is this an orthogonal matrix? If so, show that it satisfies the main properties of an orthogonal matrix. If
not, explain why it fails to be orthogonal.


(c) Does this matrix represent a proper or an improper rotation? How do you know? 


2. When you were first introduced to vectors, you most likely were told that a scalar is a quantity that is defined
by a magnitude, while a vector has both a magnitude and a direction. While this is certainly true, there is
another, more sophisticated way to define a scalar quantity and a vector quantity: through their transformation
properties. A scalar quantity transforms as $\phi' = \phi$ while a vector quantity transforms as $A'_i = \sum_j \lambda_{ij} A_j$. To
show that the scalar product does indeed transform as a scalar, note that:

$$\mathbf{A}' \cdot \mathbf{B}' = \sum_i A'_i B'_i = \sum_i \left(\sum_j \lambda_{ij} A_j\right)\left(\sum_k \lambda_{ik} B_k\right) = \sum_{j,k} \left(\sum_i \lambda_{ij}\lambda_{ik}\right) A_j B_k$$

$$= \sum_j \left(\sum_k \delta_{jk} A_j B_k\right) = \sum_j A_j B_j = \mathbf{A} \cdot \mathbf{B}$$


Now you will show that the vector product transforms as a vector. Begin by writing out what you are trying 
to show explicitly and show it to the teaching assistant. Once the teaching assistant has confirmed that you 
have the correct expression, try to prove it. The vector product is a bit more difficult to work with than the 
scalar product, so your teaching assistant is prepared to give you a hint if you get stuck. 


3. Suppose you have two rectangular coordinate systems that share a common origin, but one system is rotated
by an angle $\theta$ with respect to the other. To describe this rotation, you have made use of the rotation matrix
$\boldsymbol{\lambda}(\theta)$. (I'm changing the notation slightly to put the emphasis on the angle of rotation.)


(a) Verify that the product of two rotation matrices $\boldsymbol{\lambda}(\theta_1)\boldsymbol{\lambda}(\theta_2)$ is in itself a rotation matrix.


(b) In abstract algebra, a group $G$ is defined as a set of elements $g$ together with a binary operation $*$ acting
on that set such that four properties are satisfied:


i. (Closure) For any two elements $g_i$ and $g_j$ in the group $G$, the product of the elements, $g_i * g_j$, is also
in the group $G$.
ii. (Associativity) For any three elements $g_i$, $g_j$, $g_k$ of the group $G$, $(g_i * g_j) * g_k = g_i * (g_j * g_k)$.
iii. (Existence of Identity) The group $G$ contains an identity element $e$ such that $g * e = e * g = g$ for
all $g \in G$.
iv. (Existence of Inverses) For each element $g \in G$, there exists an inverse element $g^{-1} \in G$ such that
$g * g^{-1} = g^{-1} * g = e$.


Show that if the product $*$ denotes the product of two matrices, then the set of rotation matrices together
with $*$ forms a group. This group is known as the special orthogonal group in two dimensions, also known
as SO(2).


(c) Is this group commutative? In abstract algebra, a commutative group is called an abelian group. 


4. When you look in a mirror the image of you appears left-to-right reversed, that is, the image of your left ear
appears to be the right ear of the image and vice versa. Explain why the image is left-right reversed rather
than up-down reversed or reversed about some other axis; i.e. explain what breaks the symmetry that leads to
these properties of the mirror image.


Problems 


[1] Find the transformation matrix that rotates the axis $x_3$ of a rectangular coordinate system 45° toward $x_1$ around
the $x_2$ axis.


[2] For simplicity, take $\boldsymbol{\lambda}$ to be a two-dimensional transformation matrix. Show by direct expansion that $|\boldsymbol{\lambda}|^2 = 1$.


Appendix E 


Tensor algebra 


E.1 Tensors 


Mathematically scalars and vectors are the first two members of a hierarchy of entities, called tensors, 
that behave under coordinate transformations as described in appendix D. The use of the tensor notation 
provides a compact and elegant way to handle transformations in physics. 

A scalar is a rank 0 tensor with one component, that is invariant under change of the coordinate system.

$$\phi(x'y'z') = \phi(xyz) \tag{E.1}$$

A vector is a rank 1 tensor which has three components, that transform under rotation according to
the matrix relation

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \tag{E.2}$$

where $\boldsymbol{\lambda}$ is the rotation matrix. Equation E.2 can be written in the suffix form as

$$x'_i = \sum_{j=1}^{3} \lambda_{ij} x_j \tag{E.3}$$

The above definitions of scalars and vectors can be subsumed into a class of entities called tensors of rank $n$
that have $3^n$ components. A scalar is a tensor of rank $r = 0$, with only $3^0 = 1$ component, whereas a vector
has rank $r = 1$, that is, the vector $\mathbf{x}$ has one suffix $i$ and $3^1 = 3$ components.
A second-order tensor $T_{ij}$ has rank $r = 2$ with two suffixes, that is, it has $3^2 = 9$ components that
transform under rotation as

$$T'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik}\lambda_{jl} T_{kl} \tag{E.4}$$

For second-order tensors, the transformation formula given by equation E.4 can be written more compactly
using matrices. Thus the second-order tensor can be written as a 3 × 3 matrix

$$\mathbf{T} = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \tag{E.5}$$


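The rotational transformation given in equation E.4 can be written in the form

$$T'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik} T_{kl} \lambda_{jl} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik} T_{kl} \lambda^T_{lj} \tag{E.6}$$

where $\lambda^T_{lj}$ are the matrix elements of the transposed matrix $\boldsymbol{\lambda}^T$. The summations in E.6 can be expressed
in both the tensor and conventional matrix form as the matrix product

$$\mathbf{T}' = \boldsymbol{\lambda} \cdot \mathbf{T} \cdot \boldsymbol{\lambda}^T \tag{E.7}$$

Equation E.7 defines the rotational properties of a spherical tensor.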
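
The equivalence of the suffix form and the matrix form of this transformation can be checked numerically. The Python sketch below is illustrative: the tensor components, the rotation angle, and the helper names are assumptions, and the rotation matrix uses the passive convention of appendix D.

```python
import math

def rotation_about_x3(angle):
    """Rotation matrix for a frame rotated by `angle` about the x3 axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def transform_by_indices(lam, T):
    """Suffix form: T'_ij = sum_k sum_l lam_ik lam_jl T_kl."""
    n = range(3)
    return [[sum(lam[i][k] * lam[j][l] * T[k][l] for k in n for l in n)
             for j in n] for i in n]

def transform_by_matrices(lam, T):
    """Matrix form: T' = lam . T . lam^T."""
    return matmul(matmul(lam, T), transpose(lam))

# Hypothetical rank-2 tensor components and rotation angle.
T = [[1.0, 2.0, 0.0], [2.0, 3.0, -1.0], [0.0, -1.0, 5.0]]
lam = rotation_about_x3(0.6)

T_idx = transform_by_indices(lam, T)
T_mat = transform_by_matrices(lam, T)
trace_before = sum(T[i][i] for i in range(3))
trace_after = sum(T_mat[i][i] for i in range(3))

print(T_idx)
print(T_mat)
print(trace_before, trace_after)
```

The two forms agree component by component, and the trace of the tensor is unchanged by the rotation.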




E.2 Tensor products 


E.2.1 Tensor outer product 


Tensor products feature prominently when using tensors to represent transformations. A second-order tensor
$\mathbf{T}$ can be formed by using the tensor product, also called outer product, of two vectors $\mathbf{a}$ and $\mathbf{b}$ which,
written in suffix form, is

$$\mathbf{T} = \mathbf{a} \otimes \mathbf{b} = \begin{pmatrix} a_1b_1 & a_1b_2 & a_1b_3 \\ a_2b_1 & a_2b_2 & a_2b_3 \\ a_3b_1 & a_3b_2 & a_3b_3 \end{pmatrix} \tag{E.8}$$

In component form the matrix elements of this matrix are given by

$$T_{ij} = a_i b_j \tag{E.9}$$

This second-order tensor product has a rank $r = 2$, that is, it equals the sum of the ranks of the two
vectors. Equation E.8 is called a dyad since it was derived by taking the dyadic product of two vectors. In
general, multiplication, or division, of two vectors leads to second-order tensors. Note that this second-order
tensor product completes the triad of tensors possible taking the product of two vectors. That is, the scalar
product $\mathbf{a} \cdot \mathbf{b}$ has rank $r = 0$, the vector product $\mathbf{a} \times \mathbf{b}$ has rank $r = 1$, and the tensor product $\mathbf{a} \otimes \mathbf{b}$ has rank$^1$
$r = 2$.

Higher-order tensors can be created by taking more complicated tensor products. For example, a rank-3
tensor can be created by taking the tensor outer product of the rank-2 tensor $T_{ij}$ and a vector $c_k$ which, for
a dyadic tensor, can be written as the tensor product of three vectors. That is,

$$T_{ijk} = T_{ij} c_k = a_i b_j c_k \tag{E.10}$$


In summary, the rank of the tensor product equals the sum of the ranks of the tensors included in the tensor 
product. 
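
The way rank adds under the outer product can be seen by counting components. The Python sketch below is a minimal illustration with arbitrary vector components and hypothetical names; it builds the dyad $T_{ij} = a_i b_j$ and the rank-3 product $T_{ijk} = a_i b_j c_k$, and also evaluates the trace of the dyad.

```python
def outer(a, b):
    """Rank-2 outer product T_ij = a_i * b_j."""
    return [[ai * bj for bj in b] for ai in a]

# Arbitrary illustrative vectors.
a = (1.0, 2.0, 3.0)
b = (4.0, -5.0, 6.0)
c = (0.5, 0.0, -2.0)

T2 = outer(a, b)
T3 = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]

components_rank2 = sum(len(row) for row in T2)                    # 3**2
components_rank3 = sum(len(row) for plane in T3 for row in plane)  # 3**3

trace = sum(T2[i][i] for i in range(3))
scalar_product = sum(x * y for x, y in zip(a, b))

print(components_rank2, components_rank3)
print(trace, scalar_product)
```

The trace of the dyad equals the scalar product of the two vectors, which is the contraction used to form the inner product.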


E.2.2 Tensor inner product 


The lowest rank tensor product, which is called the inner product, is obtained by taking the tensor product
of two tensors for the special case where one index is repeated, and taking the sum over this repeated index.
Summing over this repeated index, which is called contraction, removes the two indices for which the index
is repeated, resulting in a tensor that has rank $r$ equal to the sum of the ranks minus 2 for one contraction.
That is, the product tensor has rank $r = r_1 + r_2 - 2$.

The simplest example is the inner product of two vectors which has rank $r = 1 + 1 - 2 = 0$, that is, it is
the scalar product that equals the trace of the outer-product matrix $\mathbf{a} \otimes \mathbf{b}$, and this inner product is commutative.

An especially important case is the inner product of a rank-2 dyad $\mathbf{a} \otimes \mathbf{b}$, given by equation E.8, with a
vector $\mathbf{c}$, that is, the inner product $\mathbf{T} = \mathbf{a} \otimes \mathbf{b} \cdot \mathbf{c}$. Written in component form, the inner product is

$$\sum_{i}^{3} a_i b_i c_j = \left(\sum_{i}^{3} a_i b_i\right) c_j = (\mathbf{a} \cdot \mathbf{b})\, c_j \tag{E.11}$$

The scalar product $\mathbf{a} \cdot \mathbf{b}$ is a scalar number, and thus the inner-product tensor is the vector $\mathbf{c}$ renormalized
by the magnitude of the scalar product $\mathbf{a} \cdot \mathbf{b}$. That is, it has a rank $r = 2 + 1 - 2 = 1$. Thus the inner product
of this rank-2 tensor with a vector gives a vector. The inner product of a rank-2 tensor with a rank-1 tensor
is used in this book for handling the rotation matrix, the inertia tensor for rigid-body rotation, and for the
stress and the strain tensors used to describe elasticity in solids.


E.1 Example: Displacement gradient tensor 


The displacement gradient tensor provides an example of the use of the matrix representation to
manipulate tensors. Let $\boldsymbol{\phi}(x_1, x_2, x_3)$ be a vector field expressed in a cartesian basis. The definition of the gradient
$\mathbf{G} = \nabla\boldsymbol{\phi}$ gives that

$$d\boldsymbol{\phi} = \mathbf{G} \cdot d\mathbf{x}$$


$^1$ The common convention is to denote the scalar product as $\mathbf{a} \cdot \mathbf{b}$, the vector product as $\mathbf{a} \times \mathbf{b}$, and the tensor product as $\mathbf{a} \otimes \mathbf{b}$.




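Calculating the components of $d\boldsymbol{\phi}$ in terms of $\mathbf{x}$ gives

$$d\phi_1 = \frac{\partial\phi_1}{\partial x_1}dx_1 + \frac{\partial\phi_1}{\partial x_2}dx_2 + \frac{\partial\phi_1}{\partial x_3}dx_3$$

$$d\phi_2 = \frac{\partial\phi_2}{\partial x_1}dx_1 + \frac{\partial\phi_2}{\partial x_2}dx_2 + \frac{\partial\phi_2}{\partial x_3}dx_3$$

$$d\phi_3 = \frac{\partial\phi_3}{\partial x_1}dx_1 + \frac{\partial\phi_3}{\partial x_2}dx_2 + \frac{\partial\phi_3}{\partial x_3}dx_3$$

Using index notation, with summation over the repeated index $j$ implied, this can be written as

$$d\phi_i = \frac{\partial\phi_i}{\partial x_j}dx_j$$

where the components of $\mathbf{G}$ form the matrix

$$\mathbf{G} = \begin{pmatrix} \frac{\partial\phi_1}{\partial x_1} & \frac{\partial\phi_1}{\partial x_2} & \frac{\partial\phi_1}{\partial x_3} \\ \frac{\partial\phi_2}{\partial x_1} & \frac{\partial\phi_2}{\partial x_2} & \frac{\partial\phi_2}{\partial x_3} \\ \frac{\partial\phi_3}{\partial x_1} & \frac{\partial\phi_3}{\partial x_2} & \frac{\partial\phi_3}{\partial x_3} \end{pmatrix}$$

Then the vector $d\boldsymbol{\phi}$ can be expressed compactly as the inner product of $\mathbf{G}$ and $d\mathbf{x}$, that is

$$d\boldsymbol{\phi} = \mathbf{G} \cdot d\mathbf{x}$$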


E.3 Tensor properties 


In principle one must distinguish between a 3 x 3 square matrix, and the tensor component representations of 
a rank-2 tensor. However, as illustrated by the previous discussion, for orthogonal transformations, the tensor
components of the second rank tensor transform identically with the matrix components. Thus functionally, 
the matrix formulation and tensor representations are identical. As a consequence, all the terminology and 
operations used in matrix mechanics are equally applicable to the tensor representation. 

The tensor representation of the rotation matrix provides the simplest example of the equivalence of
the matrix and tensor representations of transformations. Appendix D.2 showed that the unitary rotation
matrix λ, acting on a vector x, transforms it to the vector x′ that is rotated with respect to x. That is, the
transformation is

$$ \mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \tag{D5} $$

where

$$ \mathbf{x}' = \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad \boldsymbol{\lambda} = \begin{pmatrix} \hat{e}'_1 \cdot \hat{e}_1 & \hat{e}'_1 \cdot \hat{e}_2 & \hat{e}'_1 \cdot \hat{e}_3 \\ \hat{e}'_2 \cdot \hat{e}_1 & \hat{e}'_2 \cdot \hat{e}_2 & \hat{e}'_2 \cdot \hat{e}_3 \\ \hat{e}'_3 \cdot \hat{e}_1 & \hat{e}'_3 \cdot \hat{e}_2 & \hat{e}'_3 \cdot \hat{e}_3 \end{pmatrix} \tag{D6} $$

Appendix D.2 showed that the rotation matrix λ requires 9 components to fully specify the transformation
from the initial 3-component vector x to the rotated vector x′. The rotation tensor is a dyad as well as being
unitary and dimensionless. Note that equation D5 is an example of the inner product of a rank-2 rotation
tensor acting on a vector, leading to another vector that is rotated with respect to the first vector.
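
As an illustrative sketch, the inner product of equation D5 can be written out for axes rotated about the x3 axis. The function names, the 30 degree angle, and the test vector are assumptions chosen for illustration; they do not come from the text.

```python
import math

def rotation_about_x3(angle):
    """Direction-cosine matrix with elements e'_i . e_j for axes rotated by `angle` about x3."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def inner_product(matrix, vector):
    """Contract the second index of a rank-2 tensor with a vector: x'_i = sum_j M_ij x_j."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]

def norm(vector):
    """Euclidean length of a 3-component vector."""
    return math.sqrt(sum(component * component for component in vector))

rotation = rotation_about_x3(math.radians(30.0))  # hypothetical rotation angle
x = [1.0, 2.0, 3.0]                               # hypothetical vector
x_rotated = inner_product(rotation, x)

# The rank-2 tensor acting on a rank-1 tensor returns a rank-1 tensor of the same length.
print(x_rotated, norm(x), norm(x_rotated))
```

The nine matrix elements act on three components to give three components, and the length of the vector is unchanged, as expected for a rotation.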

In general, rank-2 tensors have dimensions and are not unitary. For example, the angular velocity vector
ω and the angular momentum vector L are related by the inner product of the inertia tensor {I} and ω.
That is

$$ \mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{11.6} $$

The inertia tensor has dimensions of mass × length² and relates two very different vector observables. The
stress tensor and the strain tensor, discussed in chapter 15, provide another example of second-order tensors 
that are used to transform one vector observable to another vector observable analogous to the case of the 
rotation matrix or the inertia tensor. 
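
A minimal sketch of equation 11.6, assuming a hypothetical non-diagonal inertia tensor and angular velocity (the numerical values are invented for illustration), shows that the resulting angular momentum need not be parallel to the angular velocity.

```python
def contract(tensor, vector):
    """Inner product of a rank-2 tensor with a vector: L_i = sum_j I_ij omega_j."""
    return [sum(tensor[i][j] * vector[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    """Vector product a x b, used here only as a test of parallelism."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical symmetric inertia tensor (kg m^2) with an off-diagonal product of inertia.
inertia = [[2.0, -0.5, 0.0],
           [-0.5, 3.0, 0.0],
           [0.0, 0.0, 4.0]]
omega = [1.0, 0.0, 0.0]  # hypothetical angular velocity (rad/s) along x1

angular_momentum = contract(inertia, omega)
misalignment = cross(omega, angular_momentum)  # zero only if L is parallel to omega

print(angular_momentum, misalignment)
```

The off-diagonal element gives the angular momentum a component along x2 even though the angular velocity points along x1.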

Note that pseudo-tensors can be used to make a rotational transformation plus a change in the sign. 
That is, they lead to a parity inversion. 

The tensor notation is used extensively in physics since it provides a powerful, elegant, and compact 
representation for describing transformations.


E.4 Contravariant and covariant tensors


In general the configuration space used to specify a dynamical system is not a Euclidean space in that
there may not be a system of coordinates for which the distance between any two neighboring points can
be represented by the sum of the squares of the coordinate differentials. For example, a set of cartesian
coordinates does not exist for the two-dimensional motion of a single particle constrained to the curved surface
of a fixed sphere. Such curved spaces need to be represented in terms of Riemannian geometry rather
than Euclidean geometry. Curved configuration spaces occur in some branches of physics such as Einstein’s
General Theory of Relativity.

Tensors have transformation properties that can be either contravariant or covariant. Consider a set of
generalized coordinates q′ that are a function of the coordinates q. Then infinitesimal changes dq^m will lead
to infinitesimal changes dq′^n where

$$ dq'^n = \sum_m \frac{\partial q'^n}{\partial q^m} dq^m \tag{E.12} $$

Contravariant components of a tensor transform according to the relation

$$ A'^n = \sum_m \frac{\partial q'^n}{\partial q^m} A^m \tag{E.13} $$

Equation E.13 relates the contravariant components in the unprimed and primed frames.
Derivatives of a scalar function φ, such as A_m = ∂φ/∂q^m, transform as

$$ A'_n = \frac{\partial \phi}{\partial q'^n} = \sum_m \frac{\partial q^m}{\partial q'^n} \frac{\partial \phi}{\partial q^m} = \sum_m \frac{\partial q^m}{\partial q'^n} A_m \tag{E.14} $$

That is, covariant components of the tensor transform according to the relation

$$ A'_n = \sum_m \frac{\partial q^m}{\partial q'^n} A_m \tag{E.15} $$

It is important to differentiate between contravariant and covariant vectors. The Einstein superscript/subscript
convention for distinguishing between these two flavours of tensors is given in table E.1.


Table E.1. Einstein notation for tensors.


x^μ | denotes a contravariant vector
x_μ | denotes a covariant vector


In linear algebra one can map from one coordinate system to another as illustrated in appendix D. That 
is, the tensor x can be expressed as components with respect to either the unprimed or primed coordinate 
frames 


For an n-dimensional manifold the unit basis column vectors ê transform according to the transformation
matrix λ

$$ \hat{\mathbf{e}}' = \boldsymbol{\lambda} \cdot \hat{\mathbf{e}} \tag{E.17} $$

Since the tensor x is independent of the coordinate basis, the components of x must have the opposite
transform

$$ \mathbf{x}' = (\boldsymbol{\lambda}^{-1})^T \cdot \mathbf{x} \tag{E.18} $$

This normal vector x is called a “contravariant vector” because it transforms contrary to the basis column
vector transformation.
The inverse of equation E.18 gives that the column vector element

$$ x_i = \sum_j \lambda_{ji} x'_j \tag{E.19} $$

Consider the case of a gradient with respect to the coordinate x in both the unprimed and primed bases.
Using the chain rule for the partial derivative, the component of the gradient in the primed frame can
be expanded as

$$ (\nabla' f)_i = \frac{\partial f}{\partial x'_i} = \sum_j \frac{\partial x_j}{\partial x'_i} \frac{\partial f}{\partial x_j} = \sum_j \lambda_{ij} \frac{\partial f}{\partial x_j} \tag{E.20} $$

That is, the gradient transforms as

$$ \nabla' f = \boldsymbol{\lambda} \cdot \nabla f \tag{E.21} $$

That is, a gradient transforms as a covariant vector, like the unit vectors, whereas a vector x is contravariant
under transformation.
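
The opposite behavior of the two kinds of vector can be checked numerically. The sketch below assumes a hypothetical non-orthogonal 2 × 2 transformation matrix and invented component values; it applies equation E.18 to the components of x and equation E.21 to a gradient, and confirms that the contraction of the two is unchanged by the transformation.

```python
def mat_vec(matrix, vector):
    """Matrix acting on a column vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]

def transpose(matrix):
    return [[matrix[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(matrix):
    """Inverse of a non-singular 2 x 2 matrix."""
    det = matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    return [[matrix[1][1] / det, -matrix[0][1] / det],
            [-matrix[1][0] / det, matrix[0][0] / det]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

transformation = [[2.0, 1.0],
                  [0.0, 1.0]]   # hypothetical non-orthogonal basis transformation
x = [3.0, 4.0]                  # hypothetical contravariant components
grad_f = [0.5, -1.0]            # hypothetical gradient (covariant) components

# Contravariant components transform with the transposed inverse (E.18).
x_primed = mat_vec(transpose(inverse_2x2(transformation)), x)
# Covariant components transform like the basis vectors (E.21).
grad_primed = mat_vec(transformation, grad_f)

print(dot(x, grad_f), dot(x_primed, grad_primed))
```

For an orthogonal transformation the transposed inverse equals the matrix itself, and the two transformation rules coincide.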

Normally the basis is orthonormal, (λ⁻¹)ᵀ = λ, and thus there is no difference between contravariant and
covariant vectors. However, for curved coordinate systems, such as non-Euclidean geometry in the General
Theory of Relativity, the covariant and contravariant vectors behave differently.

The Einstein convention is extended to apply to matrices by writing the elements of the matrix A as
A^μ_ν, while the elements of the transposed matrix Aᵀ are written as A_ν^μ. The matrix product for A with a
contravariant vector X is written as

$$ X'^{\mu} = \sum_{\nu} A^{\mu}{}_{\nu} X^{\nu} \tag{E.22} $$

where the summation over ν effectively cancels the identical superscript and subscript ν.

Similarly a covariant vector, such as a gradient, is written as

$$ (\nabla' f)_{\mu} = \sum_{\nu} A_{\mu}{}^{\nu} (\nabla f)_{\nu} = \sum_{\nu} A_{\mu}{}^{\nu} \partial_{\nu} f \tag{E.23} $$

Again the summation cancels the ν superscript and subscript. The Kronecker delta symbol is written as

$$ \sum_{\nu} \delta^{\mu}{}_{\nu} X^{\nu} = X^{\mu} \tag{E.24} $$


E.5 Generalized inner product 


The generalized definition of an inner product is

$$ S = \sum_{\mu\nu} g_{\mu\nu} X^{\mu} Y^{\nu} \tag{E.25} $$

where g_{μν} is a unitary matrix called a covariant metric. The covariant metric transforms a contravariant to
a covariant tensor. For example the matrix element of a covariant tensor X_ν can be written as

$$ X_{\nu} = \sum_{\mu} g_{\mu\nu} X^{\mu} \tag{E.26} $$

Association of the covariant metric with either of the vectors in the inner product gives

$$ S = \sum_{\mu\nu} g_{\mu\nu} X^{\mu} Y^{\nu} = \sum_{\nu} X_{\nu} Y^{\nu} = \sum_{\mu} X^{\mu} Y_{\mu} \tag{E.27} $$

Similarly it can be defined in terms of an orthogonal contravariant metric g^{μν} where

$$ S = \sum_{\mu\nu} g^{\mu\nu} X_{\mu} Y_{\nu} \tag{E.28} $$

Then

$$ X^{\nu} = \sum_{\mu} g^{\mu\nu} X_{\mu} \tag{E.29} $$

Association of the contravariant metric with one of the vectors in the inner product gives the inner
product

$$ S = \sum_{\mu\nu} g^{\mu\nu} X_{\mu} Y_{\nu} = \sum_{\nu} X^{\nu} Y_{\nu} = \sum_{\mu} X_{\mu} Y^{\mu} \tag{E.30} $$
pu v H 


For most situations in this book the metric g_{μν} is diagonal and unitary.
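
The equivalence of the three forms of the inner product in equation E.27 can be illustrated with a short sketch. The diagonal metric diag(1, 4) and the vector components used here are assumptions chosen for illustration (for example, plane polar coordinates at r = 2 would give such a metric); the function names are likewise hypothetical.

```python
def lower_index(metric, contravariant):
    """X_nu = sum_mu g_{mu nu} X^mu, as in equation E.26."""
    n = len(contravariant)
    return [sum(metric[mu][nu] * contravariant[mu] for mu in range(n)) for nu in range(n)]

def inner_product(metric, x_up, y_up):
    """S = sum_{mu nu} g_{mu nu} X^mu Y^nu, as in equation E.25."""
    n = len(x_up)
    return sum(metric[mu][nu] * x_up[mu] * y_up[nu] for mu in range(n) for nu in range(n))

metric = [[1.0, 0.0],
          [0.0, 4.0]]   # hypothetical diagonal covariant metric
x_up = [1.0, 0.5]       # hypothetical contravariant components X^mu
y_up = [2.0, 0.25]      # hypothetical contravariant components Y^nu

s_full = inner_product(metric, x_up, y_up)
x_down = lower_index(metric, x_up)
y_down = lower_index(metric, y_up)
s_from_x_down = sum(a * b for a, b in zip(x_down, y_up))   # sum_nu X_nu Y^nu
s_from_y_down = sum(a * b for a, b in zip(x_up, y_down))   # sum_mu X^mu Y_mu

print(s_full, s_from_x_down, s_from_y_down)
```

All three expressions give the same scalar, which is the content of equation E.27.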


E.6 Transformation properties of observables


In physics, observables can be represented by spherical tensors which specify the angular momentum and 
parity characteristics of the observable, and the tensor rank is independent of the time dependence. The 
transformation properties of these tensors, coupled with their time-reversal invariance, specify the funda- 
mental characteristics of the observables. 

Table E.2 summarizes the transformation properties under rotation, spatial inversion and time reversal
for observables encountered in classical mechanics and electrodynamics. Note that observables can be scalar,
vector, pseudovector, or second-order tensors, under rotation, and even or odd under either space inversion
or time inversion. For example, in classical mechanics the inertia tensor I relates the angular velocity vector
ω to the angular momentum vector L by taking the inner product L = I · ω. In general I is not diagonal and
thus the angular momentum is not parallel to the angular velocity ω. A similar example in electrodynamics
is the dielectric tensor K which relates the displacement field D to the electric field E by D = K · E. For
anisotropic crystal media K is not diagonal leading to the electric field vectors E and D not being parallel.

As discussed in chapter 7, Noether’s Theorem states that symmetries of the transformation properties lead 
to important conservation laws. The behavior of classical systems under rotation relates to the conservation 
of angular momentum, the behavior under spatial inversion relates to parity conservation, and time-reversal 
invariance relates to conservation of energy. That is, conservative forces conserve energy and are time-reversal 
invariant. 


Table E.2: Transformation properties of scalar, vector, pseudovector, and tensor observables
under rotation, spatial inversion, and time reversal.ᵃ


Physical Observable                 Rotation        Space       Time       Name
                                    (Tensor rank)   inversion   reversal

1) Classical Mechanics

Mass density ρ                      0               Even        Even       Scalar
Kinetic energy p²/2m                0               Even        Even       Scalar
Potential energy U(r)               0               Even        Even       Scalar
Lagrangian L                        0               Even        Even       Scalar
Hamiltonian H                       0               Even        Even       Scalar
Gravitational potential φ           0               Even        Even       Scalar
Coordinate r                        1               Odd         Even       Vector
Velocity v                          1               Odd         Odd        Vector
Momentum p                          1               Odd         Odd        Vector
Angular momentum L = r × p          1               Even        Odd        Pseudovector
Force F                             1               Odd         Even       Vector
Torque N = r × F                    1               Even        Even       Pseudovector
Gravitational field g               1               Odd         Even       Vector
Inertia tensor I                    2               Even        Even       Tensor
Elasticity stress tensor T_ik       2               Even        Even       Tensor

2) Electromagnetism

Charge density ρ                    0               Even        Even       Scalar
Current density j                   1               Odd         Odd        Vector
Electric field E                    1               Odd         Even       Vector
Polarization P                      1               Odd         Even       Vector
Displacement D                      1               Odd         Even       Vector
Magnetic B field B                  1               Even        Odd        Pseudovector
Magnetization M                     1               Even        Odd        Pseudovector
Magnetic H field H                  1               Even        Odd        Pseudovector
Poynting vector S = E × H           1               Odd         Odd        Vector
Dielectric tensor K                 2               Even        Even       Tensor
Maxwell stress tensor T_ik          2               Even        Even       Tensor


ᵃ Based on table 6.1 in "Classical Electrodynamics", 2nd edition, by J.D. Jackson [?]


Appendix F 


Aspects of multivariate calculus 


Multivariate calculus provides the framework for handling systems having many variables associated with 
each of several bodies. It is assumed that the reader has studied linear differential equations plus multivariate 
calculus and thus has been exposed to the calculus used in classical mechanics. Chapter 5 of this book 
introduced variational calculus which covers several important aspects of multivariate calculus such as Euler’s 
variational calculus and Lagrange multipliers. This appendix provides a brief review of a selection of other 
aspects of multivariate calculus that feature prominently in classical mechanics. 


F.1 Partial differentiation 


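The extension of the derivative to multivariate calculus involves use of partial derivatives. The partial
derivative with respect to the variable x_i of a multivariate function f(x_1, x_2, ..., x_N) involves taking the
normal one-variable derivative with respect to x_i assuming that the other N − 1 variables are held constant.
That is,

$$ \frac{\partial f(x_1, x_2, \ldots, x_N)}{\partial x_i} = \lim_{h_i \to 0} \frac{f(x_1, x_2, \ldots, x_{i-1}, (x_i + h_i), \ldots, x_N) - f(x_1, x_2, \ldots, x_N)}{h_i} \tag{F.1} $$

If the function f(x) is continuously differentiable to n-th order, then
all partial derivatives of that order or less are independent of the order in which they are performed. That
is,

$$ \frac{\partial^2 f(\mathbf{x})}{\partial x_i \partial x_j} = \frac{\partial^2 f(\mathbf{x})}{\partial x_j \partial x_i} \tag{F.2} $$

The chain rule for partial differentiation gives that

$$ \frac{\partial f(\mathbf{x}(\mathbf{y}))}{\partial y_i} = \sum_k \frac{\partial f(\mathbf{x})}{\partial x_k} \frac{\partial x_k(\mathbf{y})}{\partial y_i} \tag{F.3} $$

The total differential of a multivariate function f(x) is

$$ df = \sum_i \frac{\partial f(\mathbf{x})}{\partial x_i} dx_i \tag{F.4} $$

This can be extended to higher-order derivatives using the operator formalism

$$ d^n f(\mathbf{x}) = \left( \sum_i dx_i \frac{\partial}{\partial x_i} \right)^n f(\mathbf{x}) \tag{F.5} $$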
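
The definition of the partial derivative and the independence of the order of differentiation can be illustrated numerically. This sketch assumes a hypothetical test function and uses a central difference with a small finite step in place of the limit in equation F.1; both choices are for illustration only.

```python
def partial(func, point, i, h=1e-4):
    """Central-difference estimate of the partial derivative of func with respect to x_i."""
    forward = list(point)
    backward = list(point)
    forward[i] += h
    backward[i] -= h
    return (func(forward) - func(backward)) / (2.0 * h)

def f(x):
    """Hypothetical test function f = x1^2 x2 + x2 x3 (indices are zero-based in the code)."""
    return x[0] ** 2 * x[1] + x[1] * x[2]

point = [1.0, 2.0, 3.0]

df_dx1 = partial(f, point, 0)                                   # analytic value 2 x1 x2 = 4
d2f_dx1_dx2 = partial(lambda p: partial(f, p, 1), point, 0)     # d/dx1 of df/dx2
d2f_dx2_dx1 = partial(lambda p: partial(f, p, 0), point, 1)     # d/dx2 of df/dx1

print(df_dx1, d2f_dx1_dx2, d2f_dx2_dx1)
```

Both mixed derivatives agree with the analytic value 2 x1 = 2 to within the accuracy of the finite-difference estimate.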


F.2 Linear operators 


The linear operator notation provides a powerful, elegant, and compact way to express, and apply, the
equations of multivariate calculus; it is used extensively in mathematics and physics. The linear operators
typically comprise partial derivatives that act on scalar, vector, or tensor fields. Table F.1 lists a few
elementary examples of the use of linear operators in this textbook. The first four linear operators involve
the widely used del operator ∇ to generate the gradient, divergence and curl as described in appendices G
and H. The fifth and sixth linear operators act on the Lagrangian in Lagrangian mechanics applications.
The final two linear operators act on the wavefunction for wave mechanics.


Name          Partial derivative                 Field                  Action
Gradient      ∇ = î ∂/∂x + ĵ ∂/∂y + k̂ ∂/∂z       Scalar potential V     E = ∇V
Divergence    ∇ ·                                Vector field E         ∇ · E
Curl          ∇ ×                                Vector field E         ∇ × E
Laplacian     ∇² = ∇ · ∇                         Scalar potential V     ∇²V
                                                 Scalar Lagrangian L
                                                 Scalar Lagrangian L
                                                 Wavefunction Ψ
                                                 Wavefunction Ψ


Table F.1: Examples of linear operators used in this textbook.


There are three ways of expressing operations such as addition, multiplication, transposition or inversion
of operations that are completely equivalent because they all are based on the same principles of linear
algebra. For example, a transformation O acting on a vector A can produce the vector B. The simplest
way to express this transformation is in terms of components

$$ B_i = \sum_{j=1}^{3} O_{ij} A_j \tag{F.6} $$

Another way is to use matrix mechanics where the 3 × 3 matrix (O) transforms the column vector (A) to
the column vector (B), that is,

$$ (\mathbf{B}) = (\mathbf{O}) (\mathbf{A}) \tag{F.7} $$

The third approach is to assume an operator O acts on the vector A

$$ \mathbf{B} = \mathbf{O} \mathbf{A} \tag{F.8} $$

In classical mechanics, and quantum mechanics, these three equivalent approaches are used and exploited 
extensively and interchangeably. In particular the rules of matrix manipulation, that are given in appendix 
A, are synonymous, and equivalent to, those that apply for operator manipulation. If the operator is complex 
then the operator properties are summarized as follows. 

The generalization of the transpose for complex operators is the Hermitian conjugate O†

$$ O^{\dagger}_{ij} = O^{*}_{ji} \tag{F.9} $$

Note also that

$$ \mathbf{O}^{\dagger} = (\mathbf{O}^{*})^{T} = (\mathbf{O}^{T})^{*} \tag{F.10} $$

The generalization of a symmetric matrix is Hermitian, that is, O is equal to its Hermitian conjugate

$$ O^{\dagger}_{ij} = O^{*}_{ji} = O_{ij} \tag{F.11} $$

For a real matrix the complex conjugation has no effect so the matrix is real and symmetric.
The generalization of orthogonal is unitary, for which the operator is unitary if it is non-singular and

$$ \mathbf{O}^{-1} = \mathbf{O}^{\dagger} \tag{F.12} $$

which implies

$$ \mathbf{O} \mathbf{O}^{\dagger} = \mathbb{1} = \mathbf{O}^{\dagger} \mathbf{O} \tag{F.13} $$
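
The Hermitian and unitary properties can be verified directly from their component definitions. The two 2 × 2 complex matrices below are hypothetical examples chosen for illustration, and the helper names are assumptions rather than anything defined in the text.

```python
import math

def hermitian_conjugate(matrix):
    """Transpose followed by complex conjugation: (O-dagger)_ij = conj(O_ji)."""
    n = len(matrix)
    return [[matrix[j][i].conjugate() for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    """Matrix product in component form: C_ij = sum_k A_ik B_kj."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_close(a, b, tol=1e-12):
    """Element-by-element comparison of two square matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Hypothetical Hermitian matrix: real diagonal, off-diagonal elements complex conjugates.
hermitian = [[2 + 0j, 1 - 1j],
             [1 + 1j, 3 + 0j]]

# Hypothetical unitary matrix.
s = 1.0 / math.sqrt(2.0)
unitary = [[s + 0j, s * 1j],
           [s * 1j, s + 0j]]

hermitian_ok = is_close(hermitian_conjugate(hermitian), hermitian)                # F.11
unitary_ok = (is_close(mat_mul(unitary, hermitian_conjugate(unitary)), identity)
              and is_close(mat_mul(hermitian_conjugate(unitary), unitary), identity))  # F.13

print(hermitian_ok, unitary_ok)
```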
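

F.3 Transformation Jacobian


The Jacobian determinant, which is usually called the Jacobian, is used extensively in mechanics for both
rotational and translational coordinate transformations. The Jacobian determinant is defined as being the
ratio of the n-dimensional volume element dx_1 dx_2...dx_n in one coordinate system, to the volume element
dy_1 dy_2...dy_n in the second coordinate system. That is

$$ J(y_1 y_2 \ldots y_n) \equiv \frac{\partial(x_1 x_2 \ldots x_n)}{\partial(y_1 y_2 \ldots y_n)} = \begin{vmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n} \end{vmatrix} \tag{F.14} $$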

F.3.1 Transformation of integrals: 


Consider a coordinate transformation for the integral of the function f(x_1, x_2, ..., x_n) to the integral of a
function g(y_1, y_2, ..., y_n) where y_i = h_i(x_1, x_2, ..., x_n). The coordinate transformation of the integral equation
can be expressed in terms of the Jacobian J(y_1 y_2 ... y_n)

$$ \int f(x_1, x_2, \ldots, x_n)\, dx_1 dx_2 \ldots dx_n = \int g(y_1, y_2, \ldots, y_n) \left| \frac{\partial(x_1 x_2 \ldots x_n)}{\partial(y_1 y_2 \ldots y_n)} \right| dy_1 dy_2 \ldots dy_n = \int g(y_1, y_2, \ldots, y_n)\, J(y_1, y_2, \ldots, y_n)\, dy_1 dy_2 \ldots dy_n \tag{F.15} $$


F.3.2 Transformation of differential equations: 


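The differential cross sections for scattering can be defined either by the number of a definite kind of
particle per event going into the volume element in momentum space dp_1 dp_2 dp_3, or by the number going
into the solid angle element having momentum between p and p + dp. That is, the first definition can be
written as a differential equation

$$ \frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3} dp_1 dp_2 dp_3 = \frac{\partial^3 S(p_1(p\theta\phi), p_2(p\theta\phi), p_3(p\theta\phi))}{\partial p_1 \partial p_2 \partial p_3} \left| \frac{\partial(p_1, p_2, p_3)}{\partial(p, \theta, \phi)} \right| dp\, d\theta\, d\phi \tag{F.16} $$

As shown in table C.4, dp_1 dp_2 dp_3 = p² sin θ dp dθ dφ, that is, the Jacobian equals p² sin θ. Thus equation F.16
can be written as

$$ \frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3} dp_1 dp_2 dp_3 = \left[ \frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3} p^2 \right] (\sin\theta\, dp\, d\theta\, d\phi) = \left[ \frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3} p^2 \right] dp\, d\Omega \tag{F.17} $$

The differential cross section is defined by

$$ \frac{\partial^2 \sigma(p, \theta, \phi)}{\partial p\, \partial\Omega} \equiv p^2 \frac{\partial^3 S(p_1, p_2, p_3)}{\partial p_1 \partial p_2 \partial p_3} \tag{F.18} $$

where the p² factor is absorbed into the cross section and the solid angle term is factored out.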

F.3.3 Properties of the Jacobian: 


In classical mechanics the Jacobian often is extended from 3 dimensions to n-dimensional transformations. 
The Jacobian is unity for unitary transformations such as rotations and linear translations which implies that 
the volume element is preserved. It will be shown that this also is true for a certain class of transformations 
in classical mechanics that are called canonical transformations. The Jacobian transforms the local density 
to be correct for any scale transformations such as transforming linear dimensions from centimeters to inches. 


F.1 Example: Jacobian for transform from cartesian to spherical coordinates 


Consider the transform in the three-dimensional integral ∫ f(x_1, x_2, x_3) dx_1 dx_2 dx_3 under transformation
from cartesian coordinates (x_1, x_2, x_3) to spherical coordinates (r, θ, φ). The transformation is governed by
the geometric relations x_1 = r sin θ cos φ, x_2 = r sin θ sin φ, x_3 = r cos θ. For this transformation the Jacobian
determinant equals

$$ J(r, \theta, \phi) = \begin{vmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2 \sin\theta $$

Thus the three-dimensional volume integral transforms to

$$ \int f(x_1, x_2, x_3)\, dx_1 dx_2 dx_3 = \int f(r, \theta, \phi)\, J(r, \theta, \phi)\, dr\, d\theta\, d\phi = \int f(r, \theta, \phi)\, r^2 \sin\theta\, dr\, d\theta\, d\phi $$

which is the well-known volume integral in spherical coordinates.
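
The value of this Jacobian can be checked numerically. The sketch below builds the matrix of partial derivatives by central differences and compares its determinant with r² sin θ; the sample point, step size, and function names are assumptions made for illustration.

```python
import math

def cartesian(r, theta, phi):
    """Cartesian coordinates (x1, x2, x3) for spherical coordinates (r, theta, phi)."""
    return [r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta)]

def jacobian_matrix(transform, q, h=1e-6):
    """Matrix of partial derivatives dx_i/dq_k estimated by central differences."""
    columns = []
    for k in range(3):
        plus = list(q)
        minus = list(q)
        plus[k] += h
        minus[k] -= h
        x_plus = transform(*plus)
        x_minus = transform(*minus)
        columns.append([(x_plus[i] - x_minus[i]) / (2.0 * h) for i in range(3)])
    # columns[k][i] holds dx_i/dq_k; rearrange so that row i, column k holds dx_i/dq_k.
    return [[columns[k][i] for k in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 2.0, 0.7, 1.1   # hypothetical sample point
numeric_jacobian = det3(jacobian_matrix(cartesian, [r, theta, phi]))
analytic_jacobian = r ** 2 * math.sin(theta)

print(numeric_jacobian, analytic_jacobian)
```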


F.4 Legendre transformation 


Hamiltonian mechanics can be derived directly from Lagrangian mechanics by considering the Legendre
transformation between the conjugate variables (q, q̇, t) and (q, p, t). Such a derivation is of considerable
importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those
used to derive Lagrangian mechanics; that is d’Alembert’s Principle or Hamilton’s Principle. The general
problem of converting Lagrange’s equations into the Hamiltonian form hinges on the inversion of equation
(8.3) that defines the generalized momentum p. This inversion is simplified by the fact that (8.3) is the first
partial derivative of the Lagrangian L(q, q̇, t) which is a scalar function.
Consider transformations between two functions F(u, w) and G(v, w) where u and v are the active
variables related by the functional form

$$ \mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w}) \tag{F.19} $$

and where w designates passive variables and ∇_u F(u, w) is the first-order derivative of F(u, w), i.e. the
gradient, with respect to the components of the vector u. The Legendre transform states that the inverse
formula can always be written in the form

$$ \mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v}, \mathbf{w}) \tag{F.20} $$

where the function G(v, w) is related to F(u, w) by the symmetric relation

$$ G(\mathbf{v}, \mathbf{w}) + F(\mathbf{u}, \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} \tag{F.21} $$

and where the scalar product u · v = Σ_i u_i v_i.
Furthermore the derivatives with respect to all the passive variables {w_i} are related by

$$ \nabla_{\mathbf{w}} F(\mathbf{u}, \mathbf{w}) = -\nabla_{\mathbf{w}} G(\mathbf{v}, \mathbf{w}) \tag{F.22} $$

The relationship between the functions F(u, w) and G(v, w) is symmetrical and each is said to be the
Legendre transform of the other.
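
Equations F.19 to F.22 can be checked for a simple one-dimensional case. The sketch below assumes the hypothetical function F(u, w) = w u²/2, which has the form of a kinetic energy with u playing the role of a velocity and the passive variable w the role of a mass; its Legendre transform is then G(v, w) = v²/(2w). The numerical values are invented for illustration.

```python
def F(u, w):
    """Hypothetical function of the active variable u and passive variable w."""
    return 0.5 * w * u ** 2

def G(v, w):
    """Legendre transform of F, obtained from G = u v - F with u = v / w."""
    return v ** 2 / (2.0 * w)

def derivative(func, x, h=1e-6):
    """Central-difference derivative of a function of one variable."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

w = 3.0   # hypothetical passive variable
u = 1.5   # hypothetical active variable

v = derivative(lambda s: F(s, w), u)              # F.19: v = dF/du
u_recovered = derivative(lambda s: G(s, w), v)    # F.20: u = dG/dv
symmetric_sum = G(v, w) + F(u, w)                 # F.21: equals u v
dF_dw = derivative(lambda s: F(u, s), w)
dG_dw = derivative(lambda s: G(v, s), w)          # F.22: dF/dw = -dG/dw

print(v, u_recovered, symmetric_sum, u * v, dF_dw, dG_dw)
```

The same steps applied to a Lagrangian, with u the generalized velocity and v the generalized momentum, give the Hamiltonian.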


Workshop exercises 


1. Below you will find a set of integrals. Your teaching assistant will divide you into groups and each group will 
be assigned one integral to work on. Once your group has solved the integral, write the solution on the board 
in the space provided by the teaching assistant. 


(a) $$ \iiint r^2 \sin\theta \, dr \, d\theta \, d\phi $$

(b) S(- 5) at

(c) ∮_S A · da where A = x î + y ĵ + z k̂ and S is the sphere x² + y² + z² = 9.

(d) ∫_S (∇ × A) · da where A = y î + z ĵ + x k̂ and S is the surface defined by the paraboloid z = 1 − x² − y²,
where z ≥ 0.


Appendix G 


Vector differential calculus 


This appendix reviews vector differential calculus which is used extensively in both classical mechanics and 
electromagnetism. 


G.1 Scalar differential operators 


G.1.1 Scalar field 


Differential operators like time (d/dt) do not change the rotational properties of scalars or proper vectors. A
scalar operator d/ds acting on a scalar field φ(x, y, z) in a rotated coordinate frame φ′(x′, y′, z′) is unchanged.

$$ \frac{d\phi'}{ds} = \frac{d\phi}{ds} \tag{G.1} $$


G.1.2 Vector field


Similarly for a proper vector field

$$ \frac{dA'_i}{ds} = \sum_j \lambda_{ij} \frac{dA_j}{ds} \tag{G.2} $$

That is, differentiation of scalar or vector fields with respect to a scalar operator does not change the
rotational behavior. In particular, the scalar differentials of vectors continue to obey the rules of ordinary
proper vectors. The scalar operator d/dt is used for calculation of velocity or acceleration.


G.2 Vector differential operators in cartesian coordinates 


Vector differential operators, such as the gradient operator, are important in physics. The action of vector
operators differs along different orthogonal axes.


G.2.1 Scalar field


Consider a continuous, single-valued scalar function φ(x_1, x_2, x_3). Since

$$ \phi' = \phi \tag{G.3} $$

then the partial differential with respect to one component x′_i of the vector x′ gives

$$ \frac{\partial \phi'}{\partial x'_i} = \sum_j \frac{\partial \phi}{\partial x_j} \frac{\partial x_j}{\partial x'_i} \tag{G.4} $$

The inverse rotation gives that

$$ x_j = \sum_k \lambda_{kj} x'_k \tag{G.5} $$

Therefore

$$ \frac{\partial x_j}{\partial x'_i} = \frac{\partial}{\partial x'_i} \left( \sum_k \lambda_{kj} x'_k \right) = \sum_k \lambda_{kj} \delta_{ik} = \lambda_{ij} \tag{G.6} $$

Thus

$$ \frac{\partial \phi'}{\partial x'_i} = \sum_j \lambda_{ij} \frac{\partial \phi}{\partial x_j} \tag{G.7} $$

That is, the vector derivative acting on a scalar field transforms like a proper vector.
Define the gradient, or ∇ operator, as

$$ \nabla \equiv \sum_i \hat{e}_i \frac{\partial}{\partial x_i} \tag{G.8} $$

where ê_i is the unit vector along the x_i axis. In cartesian coordinates, the del vector operator is

$$ \nabla \equiv \hat{i} \frac{\partial}{\partial x} + \hat{j} \frac{\partial}{\partial y} + \hat{k} \frac{\partial}{\partial z} \tag{G.9} $$

The gradient was applied to the gravitational and electrostatic potential to derive the corresponding field.
For example, for electrostatics it was shown that the gradient of the scalar electrostatic potential field V can
be written in cartesian coordinates as

$$ \mathbf{E} = -\nabla V \tag{G.10} $$

Note that the gradient of a scalar field produces a vector field. You are familiar with this if you are a skier
in that the gravitational force pulls you down the line of steepest descent for the ski slope.


G.2.2 Vector field


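Another possible operation for the del operator is the scalar product with a vector. Using the definition of
a scalar product in cartesian coordinates gives

$$ \nabla \cdot \mathbf{A} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{G.11} $$

This scalar derivative of a vector field is called the divergence. Note that the scalar product produces a
scalar field which is invariant to rotation of the coordinate axes.

The vector product of the del operator with another vector is called the curl, which is used extensively
in physics. It can be written in the determinant form

$$ \nabla \times \mathbf{A} = \begin{vmatrix} \hat{i} & \hat{j} & \hat{k} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix} \tag{G.12} $$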

By contrast to the scalar product, both the gradient of a scalar field, and the vector product, are vector 
fields for which the components along the coordinate axes transform in a specific manner, such as to keep the 
length of the vector constant, as the coordinate frame is rotated. The gradient, scalar and vector products 
with the V operator are the first order derivatives of fields that occur most frequently in physics. 

Second derivatives of fields also are used. Let us consider some possible combinations of the product of 
two del operators. 


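1) ∇ · (∇V) = ∇²V
The scalar product of two del operators is a scalar under rotation. Evaluating the scalar product in
cartesian coordinates gives

$$ \left( \hat{i} \frac{\partial}{\partial x} + \hat{j} \frac{\partial}{\partial y} + \hat{k} \frac{\partial}{\partial z} \right) \cdot \left( \hat{i} \frac{\partial V}{\partial x} + \hat{j} \frac{\partial V}{\partial y} + \hat{k} \frac{\partial V}{\partial z} \right) = \frac{\partial^2 V}{\partial x^2} + \frac{\partial^2 V}{\partial y^2} + \frac{\partial^2 V}{\partial z^2} \equiv \nabla^2 V \tag{G.13} $$

This also can be obtained without confusion by writing this product as

$$ \nabla \cdot (\nabla V) = \nabla \cdot \nabla V = (\nabla \cdot \nabla) V \tag{G.14} $$

where the scalar product of the del operator is a scalar, called the Laplacian ∇², given by

$$ \nabla \cdot \nabla \equiv \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{G.15} $$

The Laplacian operator is encountered frequently in physics.


2) ∇ × (∇V) = 0
Note that the vector product of two identical vectors vanishes

$$ \mathbf{A} \times \mathbf{A} = 0 \tag{G.16} $$

Therefore

$$ \nabla \times (\nabla V) = 0 \tag{G.17} $$

This can be confirmed by evaluating the separate components along each axis.


3) ∇ · (∇ × A) = 0
This is zero because the cross-product ∇ × A is perpendicular to ∇ and thus the dot product is zero.


4) ∇ × (∇ × A) = ∇(∇ · A) − ∇²A
The identity

$$ \mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B} (\mathbf{A} \cdot \mathbf{C}) - (\mathbf{A} \cdot \mathbf{B}) \mathbf{C} \tag{G.18} $$

can be used to give

$$ \nabla \times (\nabla \times \mathbf{A}) = \nabla (\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A} \tag{G.19} $$

since ∇ · ∇ = ∇².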
There are pitfalls in the discussion of second derivatives in that it is assumed that both del operators 
operate on the same variable, otherwise the results are different. 
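
The second-derivative identities can be confirmed numerically by evaluating the separate components with finite differences. The scalar field V, the vector field A, the sample point, and the step size below are all hypothetical choices made for illustration.

```python
import math

def partial(func, point, i, h=1e-3):
    """Central-difference partial derivative of a scalar-valued function."""
    forward = list(point)
    backward = list(point)
    forward[i] += h
    backward[i] -= h
    return (func(forward) - func(backward)) / (2.0 * h)

def gradient(scalar, point):
    return [partial(scalar, point, i) for i in range(3)]

def divergence(vector, point):
    return sum(partial(lambda q, i=i: vector(q)[i], point, i) for i in range(3))

def curl(vector, point):
    def d(i, j):
        """Partial derivative of component A_i with respect to x_j."""
        return partial(lambda q: vector(q)[i], point, j)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

def V(p):
    """Hypothetical scalar field."""
    x, y, z = p
    return x * x * y + math.sin(z) * y

def A(p):
    """Hypothetical vector field."""
    x, y, z = p
    return [y * z, x * x, math.cos(x) * z]

point = [0.3, -0.7, 1.2]

curl_of_gradient = curl(lambda q: gradient(V, q), point)       # expected to vanish
divergence_of_curl = divergence(lambda q: curl(A, q), point)   # expected to vanish
laplacian_V = divergence(lambda q: gradient(V, q), point)      # analytic: y (2 - sin z)

print(curl_of_gradient, divergence_of_curl, laplacian_V)
```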


G.3 Vector differential operators in curvilinear coordinates 


As discussed in Appendix C there are many situations where the symmetries make it more convenient to use 
orthogonal curvilinear coordinate systems rather than cartesian coordinates. Thus it is necessary to extend 
vector derivatives from cartesian to curvilinear coordinates. Table C.1 can be used for expressing vector 
derivatives in curvilinear coordinate systems. 


G.3.1 Gradient: 


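The gradient in curvilinear coordinates is

$$ \nabla f = \frac{1}{h_1} \frac{\partial f}{\partial q_1} \hat{q}_1 + \frac{1}{h_2} \frac{\partial f}{\partial q_2} \hat{q}_2 + \frac{1}{h_3} \frac{\partial f}{\partial q_3} \hat{q}_3 \tag{G.20} $$

where the coefficients h_i are listed in table C.1.
For cylindrical coordinates this becomes

$$ \nabla f = \frac{\partial f}{\partial \rho} \hat{\rho} + \frac{1}{\rho} \frac{\partial f}{\partial \phi} \hat{\phi} + \frac{\partial f}{\partial z} \hat{z} \tag{G.21} $$

In spherical coordinates

$$ \nabla f = \frac{\partial f}{\partial r} \hat{r} + \frac{1}{r} \frac{\partial f}{\partial \theta} \hat{\theta} + \frac{1}{r \sin\theta} \frac{\partial f}{\partial \phi} \hat{\phi} \tag{G.22} $$


G.3.2 Divergence:


The divergence can be expressed as

$$ \nabla \cdot \mathbf{A} = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1} (A_1 h_2 h_3) + \frac{\partial}{\partial q_2} (A_2 h_3 h_1) + \frac{\partial}{\partial q_3} (A_3 h_1 h_2) \right] \tag{G.23} $$

In cylindrical coordinates the divergence is

$$ \nabla \cdot \mathbf{A} = \frac{1}{\rho} \frac{\partial}{\partial \rho} (\rho A_\rho) + \frac{1}{\rho} \frac{\partial A_\phi}{\partial \phi} + \frac{\partial A_z}{\partial z} = \frac{A_\rho}{\rho} + \frac{\partial A_\rho}{\partial \rho} + \frac{1}{\rho} \frac{\partial A_\phi}{\partial \phi} + \frac{\partial A_z}{\partial z} \tag{G.24} $$

In spherical coordinates the divergence is

$$ \nabla \cdot \mathbf{A} = \frac{1}{r^2 \sin\theta} \left[ \frac{\partial}{\partial r} (A_r r^2 \sin\theta) + \frac{\partial}{\partial \theta} (A_\theta r \sin\theta) + \frac{\partial}{\partial \phi} (A_\phi r) \right] \tag{G.25} $$


G.3.3 Curl:


$$ \nabla \times \mathbf{A} = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1 \hat{q}_1 & h_2 \hat{q}_2 & h_3 \hat{q}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ h_1 A_1 & h_2 A_2 & h_3 A_3 \end{vmatrix} \tag{G.26} $$

In cylindrical coordinates the curl is

$$ \nabla \times \mathbf{A} = \frac{1}{\rho} \begin{vmatrix} \hat{\rho} & \rho \hat{\phi} & \hat{z} \\ \frac{\partial}{\partial \rho} & \frac{\partial}{\partial \phi} & \frac{\partial}{\partial z} \\ A_\rho & \rho A_\phi & A_z \end{vmatrix} \tag{G.27} $$

In spherical coordinates the curl is

$$ \nabla \times \mathbf{A} = \frac{1}{r^2 \sin\theta} \begin{vmatrix} \hat{r} & r \hat{\theta} & r \sin\theta\, \hat{\phi} \\ \frac{\partial}{\partial r} & \frac{\partial}{\partial \theta} & \frac{\partial}{\partial \phi} \\ A_r & r A_\theta & r \sin\theta\, A_\phi \end{vmatrix} \tag{G.28} $$


G.3.4 Laplacian:


Taking the divergence of the gradient of a scalar gives

$$ \nabla^2 f = \nabla \cdot \nabla f = \frac{1}{h_1 h_2 h_3} \left[ \frac{\partial}{\partial q_1} \left( \frac{h_2 h_3}{h_1} \frac{\partial f}{\partial q_1} \right) + \frac{\partial}{\partial q_2} \left( \frac{h_3 h_1}{h_2} \frac{\partial f}{\partial q_2} \right) + \frac{\partial}{\partial q_3} \left( \frac{h_1 h_2}{h_3} \frac{\partial f}{\partial q_3} \right) \right] \tag{G.29} $$

The Laplacian of a scalar function f in cylindrical coordinates is

$$ \nabla^2 f = \frac{1}{\rho} \frac{\partial}{\partial \rho} \left( \rho \frac{\partial f}{\partial \rho} \right) + \frac{1}{\rho^2} \frac{\partial^2 f}{\partial \phi^2} + \frac{\partial^2 f}{\partial z^2} \tag{G.30} $$

The Laplacian of a scalar function f in spherical coordinates is

$$ \nabla^2 f = \frac{1}{r^2} \frac{\partial}{\partial r} \left( r^2 \frac{\partial f}{\partial r} \right) + \frac{1}{r^2 \sin\theta} \frac{\partial}{\partial \theta} \left( \sin\theta \frac{\partial f}{\partial \theta} \right) + \frac{1}{r^2 \sin^2\theta} \frac{\partial^2 f}{\partial \phi^2} \tag{G.31} $$

The gradient, divergence, curl and Laplacian are used extensively in curvilinear coordinate systems when
dealing with vector fields in Newtonian mechanics, electromagnetism, and fluid flow.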
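
As a consistency check on the spherical form of the Laplacian, the sketch below evaluates the three terms of the standard spherical-coordinate expression by nested central differences for a hypothetical test function f = x² + yz, whose cartesian Laplacian is exactly 2. The sample point and step size are assumptions made for illustration.

```python
import math

def d(func, x, h=1e-4):
    """Central-difference derivative of a function of one variable."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def f_cartesian(x, y, z):
    """Hypothetical test function with cartesian Laplacian equal to 2."""
    return x * x + y * z

def f_spherical(r, theta, phi):
    """The same function expressed in spherical coordinates."""
    return f_cartesian(r * math.sin(theta) * math.cos(phi),
                       r * math.sin(theta) * math.sin(phi),
                       r * math.cos(theta))

def laplacian_spherical(g, r, theta, phi):
    """Radial, polar, and azimuthal terms of the Laplacian in spherical coordinates."""
    radial = d(lambda s: s ** 2 * d(lambda t: g(t, theta, phi), s), r) / r ** 2
    polar = (d(lambda s: math.sin(s) * d(lambda t: g(r, t, phi), s), theta)
             / (r ** 2 * math.sin(theta)))
    azimuthal = (d(lambda s: d(lambda t: g(r, theta, t), s), phi)
                 / (r ** 2 * math.sin(theta) ** 2))
    return radial + polar + azimuthal

laplacian_value = laplacian_spherical(f_spherical, 1.5, 0.8, 0.4)

print(laplacian_value)
```

The three curvilinear terms are individually position dependent, but their sum reproduces the constant cartesian result to within the finite-difference accuracy.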


Appendix H 


Vector integral calculus 


Field equations, such as for electromagnetic and gravitational fields, require both line integrals, and surface 
integrals, of vector fields to evaluate potential, flux and circulation. These require use of the gradient, the 
Divergence Theorem and Stokes Theorem which are discussed in the following sections. 


H.1 Line integral of the gradient of a scalar field


The change ΔV in a scalar field for an infinitesimal step dl along a path can be written as

$$ \Delta V = (\nabla V) \cdot d\mathbf{l} \tag{H.1} $$

since the gradient of V, that is, ∇V, is the rate of change of V with dl. Discussions of gravitational and
electrostatic potential show that the line integral between points a and b is given in terms of the del operator
by

$$ V_b - V_a = \int_a^b (\nabla V) \cdot d\mathbf{l} \tag{H.2} $$


This relates the difference in values of a scalar field at two points to the line integral of the dot product of 
the gradient with the element of the line integral. 
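
Equation H.2 can be illustrated by summing (∇V) · dl over many short steps along a path and comparing with the difference of the end-point values. The scalar field, the helical path, and the number of steps in this sketch are hypothetical choices.

```python
import math

def V(x, y, z):
    """Hypothetical scalar field."""
    return x * y + z ** 2

def grad_V(x, y, z):
    """Analytic gradient of the hypothetical field V."""
    return [y, x, 2.0 * z]

def path(t):
    """Hypothetical helical path from t = 0 (point a) to t = 1 (point b)."""
    return [math.cos(t), math.sin(t), t]

def line_integral(steps=2000):
    """Sum of (grad V) . dl over short chords, with the gradient taken at each midpoint."""
    total = 0.0
    for k in range(steps):
        t0 = k / steps
        t1 = (k + 1) / steps
        start, end = path(t0), path(t1)
        gradient = grad_V(*path(0.5 * (t0 + t1)))
        total += sum(gradient[i] * (end[i] - start[i]) for i in range(3))
    return total

integral = line_integral()
difference = V(*path(1.0)) - V(*path(0.0))

print(integral, difference)
```

Any other path between the same two end points gives the same value, since the integrand is the gradient of a scalar field.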


H.2 Divergence theorem 


H.2.1 Flux of a vector field for Gaussian surface 


Consider the flux Φ of a vector field F for a closed surface, usually
called a Gaussian surface, S shown in figure H.1.

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} \tag{H.3} $$

Suppose the enclosed volume is cut into two pieces enclosed by surfaces
S_1 = S_a + S_ab and S_2 = S_b + S_ab. The fluxes through the surface S_ab
common to both S_1 and S_2 are equal and in the same direction. Then
the net flux through the sum of S_1 and S_2 is given by

$$ \oint_{S_1} \mathbf{F} \cdot d\mathbf{S} + \oint_{S_2} \mathbf{F} \cdot d\mathbf{S} = \oint_S \mathbf{F} \cdot d\mathbf{S} \tag{H.4} $$

since the contributions of the common surface S_ab cancel in that the
flux out of S_1 is equal and opposite to the flux out of S_2 over the surface
S_ab. That is, independent of how many times the volume enclosed by
S is subdivided, the net flux for the sum of all the Gaussian surfaces
enclosing these subdivisions of the volume still equals ∮_S F · dS.


Figure H.1: A volume V enclosed by a closed surface S is cut into two pieces at the surface S_ab. This gives
V_1 enclosed by S_1 and V_2 enclosed by S_2.


Consider that the volume enclosed by S is subdivided into N subdivisions where N → ∞. Even
though ∮_{S_i} F · dS → 0 as N → ∞, the sum over surfaces of all the infinitesimal volumes remains unchanged

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_{i}^{N \to \infty} \oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \tag{H.5} $$

Thus we can take the limit of a sum of an infinite number of infinitesimal volumes as is needed to obtain a
differential form. The surface integral for each infinitesimal volume will equal zero, which is not useful, that
is ∮_{S_i} F · dS → 0 as N → ∞. However, the flux per unit volume has a finite value as N → ∞. This ratio is
called the divergence of the vector field;

$$ \mathrm{div}\,\mathbf{F} = \lim_{\Delta\tau_i \to 0} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} \tag{H.6} $$

where Δτ_i is the infinitesimal volume enclosed by surface S_i. The divergence of the vector field is a scalar
quantity.

Thus the sum of flux over all infinitesimal subdivisions of the volume enclosed by a closed surface S
equals

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_{i}^{N \to \infty} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} \Delta\tau_i = \sum_{i}^{N \to \infty} \mathrm{div}\,\mathbf{F}\, \Delta\tau_i \tag{H.7} $$

In the limit N → ∞, Δτ_i → 0, this becomes the integral

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \mathrm{div}\,\mathbf{F}\, d\tau \tag{H.8} $$

This is called the Divergence Theorem or Gauss’s Theorem. To avoid confusion with Gauss’s law in
electrostatics, it will be referred to as the Divergence theorem.


H.2.2 Divergence in cartesian coordinates. 


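Consider the special case of an infinitesimal rectangular box, size
Δx, Δy, Δz, shown in figure H.2. Consider the net flux for the z component
F_z entering the surface ΔxΔy at location (x, y, z).

$$ \Delta\Phi_z^{in} = \left( F_z + \frac{\Delta x}{2} \frac{\partial F_z}{\partial x} + \frac{\Delta y}{2} \frac{\partial F_z}{\partial y} \right) \Delta x \Delta y \tag{H.9} $$

The net flux of the z component out of the surface at z + Δz is

$$ \Delta\Phi_z^{out} = \left( F_z + \Delta z \frac{\partial F_z}{\partial z} + \frac{\Delta x}{2} \frac{\partial F_z}{\partial x} + \frac{\Delta y}{2} \frac{\partial F_z}{\partial y} \right) \Delta x \Delta y \tag{H.10} $$

Thus the net flux out of the box due to the z component of F is

$$ \Delta\Phi_z = \Delta\Phi_z^{out} - \Delta\Phi_z^{in} = \frac{\partial F_z}{\partial z} \Delta x \Delta y \Delta z \tag{H.11} $$

Adding the similar x and y components for ΔΦ gives

$$ \Delta\Phi = \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \Delta x \Delta y \Delta z \tag{H.12} $$


Figure H.2: Computation of flux out of an infinitesimal rectangular box, Δx, Δy, Δz.


This gives that the divergence of the vector field F is

$$ \mathrm{div}\,\mathbf{F} = \lim_{\Delta\tau \to 0} \frac{\oint_S \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \tag{H.13} $$

since Δτ = ΔxΔyΔz. But the right-hand side of the equation equals the scalar product ∇ · F, that is,

$$ \mathrm{div}\,\mathbf{F} = \nabla \cdot \mathbf{F} \tag{H.14} $$

The divergence is a scalar quantity. The physical meaning of the divergence is that it gives the net flux per
unit volume flowing out of an infinitesimal volume. A positive divergence corresponds to a net outflow of
flux from the infinitesimal volume at any location while a negative divergence implies a net inflow of flux
to this infinitesimal volume.

It was shown that for an infinitesimal rectangular box

$$ \Delta\Phi = \left( \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z} \right) \Delta x \Delta y \Delta z = \nabla \cdot \mathbf{F}\, \Delta\tau \tag{H.15} $$

Integrating over the finite volume enclosed by the surface S gives

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla \cdot \mathbf{F}\, d\tau \tag{H.16} $$

This is another way of expressing the Divergence theorem

$$ \Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \mathrm{div}\,\mathbf{F}\, d\tau \tag{H.17} $$

The divergence theorem, developed by Gauss, is of considerable importance; it relates the surface integral of
a vector field, that is, the outgoing flux, to a volume integral of ∇ · F over the enclosed volume.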


H.1 Example: Maxwell’s Flux Equations


As an example of the usefulness of this relation, consider Gauss’s law for the flux in Maxwell’s
equations.

Gauss’s law for the electric field

$$\Phi = \oint_{\text{Closed surface}} \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_{\text{Enclosed volume}} \rho\,d\tau$$

But the divergence relation gives that

$$\Phi = \oint_S \mathbf{E}\cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla\cdot\mathbf{E}\,d\tau$$

Combining these gives

$$\oint_{\text{Closed surface}} \mathbf{E}\cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla\cdot\mathbf{E}\,d\tau = \frac{1}{\varepsilon_0}\int_{\text{Enclosed volume}} \rho\,d\tau$$

This is true independent of the shape of the surface or enclosed volume, leading to the differential form
of Maxwell’s first law, that is, Gauss’s law for the electric field.

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}$$

The differential form of Gauss’s law relates $\nabla\cdot\mathbf{E}$ to the charge density $\rho$ at that same location. This is
much easier to evaluate than the surface and volume integrals required when using the integral form of Gauss’s law.

Gauss’s law for magnetism

$$\Phi = \oint_{\text{Closed surface}} \mathbf{B}\cdot d\mathbf{S} = 0$$

Using the divergence theorem gives that

$$\Phi = \oint_{\text{Closed surface}} \mathbf{B}\cdot d\mathbf{S} = \int_{\text{Enclosed volume}} \nabla\cdot\mathbf{B}\,d\tau = 0$$

This is true independent of the shape of the Gaussian surface, leading to the differential form of Gauss’s law
for $\mathbf{B}$

$$\nabla\cdot\mathbf{B} = 0$$


That is, the local value of the divergence of B is zero everywhere. 
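The shape independence of Gauss’s law can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything from the original text: it uses units in which $q/\varepsilon_0 = 1$, a cubic Gaussian surface, and hypothetical helper names `coulomb_field` and `flux_through_cube`. The flux should be close to 1 when the point charge is anywhere inside the cube and close to 0 when it is outside.

```python
import math

def coulomb_field(x, y, z, source):
    """Field of a point charge at `source`, in units where q/eps0 = 1."""
    rx, ry, rz = x - source[0], y - source[1], z - source[2]
    r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
    k = 1.0 / (4.0 * math.pi * r3)
    return (k * rx, k * ry, k * rz)

def flux_through_cube(source, half=1.0, n=100):
    """Outward flux through the cube |x|, |y|, |z| <= half (midpoint rule)."""
    h = 2.0 * half / n
    flux = 0.0
    for i in range(n):
        u = -half + (i + 0.5) * h
        for j in range(n):
            v = -half + (j + 0.5) * h
            for axis in range(3):          # faces normal to x, y, z
                for sign in (1, -1):       # the +axis face and the -axis face
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half
                    p[(axis + 1) % 3] = u
                    p[(axis + 2) % 3] = v
                    E = coulomb_field(p[0], p[1], p[2], source)
                    # Outward normal is sign * unit vector along `axis`.
                    flux += sign * E[axis] * h * h
    return flux

if __name__ == "__main__":
    print(flux_through_cube((0.3, -0.2, 0.1)))   # charge inside the cube
    print(flux_through_cube((3.0, 0.0, 0.0)))    # charge outside the cube
```

Moving the charge around inside the cube changes the flux through each individual face but should leave the total essentially unchanged.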


H.2 Example: Buoyancy forces in fluids 


Buoyancy in fluids provides an example of the use of flux in physics. Consider a fluid of density $\rho(z)$
in a gravitational field $\mathbf{g}(z) = -g(z)\hat{\mathbf{z}}$ where the $z$ axis points in the opposite direction to the gravitational
force. Pressure equals force per unit area and is a scalar quantity. For a conservative fluid system, in static
equilibrium, the net work done per unit area for an infinitesimal displacement $d\mathbf{r}$ is zero. The net pressure
force per unit area is the difference $P(\mathbf{r}+d\mathbf{r}) - P(\mathbf{r}) = \nabla P\cdot d\mathbf{r}$ while the net change in gravitational potential
energy is $\rho(z)g(z)\hat{\mathbf{z}}\cdot d\mathbf{r}$. Thus energy conservation gives

$$\left[\nabla P + \rho(z)g(z)\hat{\mathbf{z}}\right]\cdot d\mathbf{r} = 0$$

which can be expanded as

$$\frac{dP}{dz} = -\rho(z)g(z) \tag{A}$$

$$\frac{dP}{dx} = \frac{dP}{dy} = 0$$

Integrating the net forces normal to the surface over any closed surface enclosing an empty volume, inside
the fluid, gives a net buoyancy force on this volume that simplifies using the Divergence theorem

$$\oint \mathbf{F}\cdot d\mathbf{S} = \oint P\hat{\mathbf{z}}\cdot d\mathbf{S} = \oint P\,dS_z = \int_{\text{Enclosed volume}}\left(\frac{dP}{dx} + \frac{dP}{dy} + \frac{dP}{dz}\right)d\tau$$

Using equations A leads to the net buoyancy force

$$\oint \mathbf{F}\cdot d\mathbf{S} = \int_{\text{Enclosed volume}}\frac{dP}{dz}\,d\tau = -\int_{\text{Enclosed volume}}\rho(z)g(z)\,d\tau$$

The right hand side of this equation equals minus the weight of the displaced fluid. That is, the buoyancy force
equals the weight of the fluid displaced by the empty volume. Note that this proof applies both to compressible
fluids, where the density depends on pressure, as well as to incompressible fluids where the density is constant.
It also applies to situations where local gravity $g$ is position dependent. If an object of mass $M$ is completely
submerged then the net force on the object is $Mg - \int_{\text{Enclosed volume}}\rho(z)g(z)\,d\tau$. If the object floats on the surface
of a fluid then the buoyancy force must be calculated separately for the volume under the fluid surface and
the upper volume above the fluid surface. The buoyancy due to displaced air usually is negligible since the
density of air is about $10^{-3}$ times that of fluids such as water.
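The buoyancy result can be checked by integrating the pressure directly over a closed surface. The following sketch assumes a sphere, a uniform fluid density and uniform gravity, so that $P(z) = P_0 - \rho g z$; the function names and the numerical values are illustrative assumptions. It sums the $z$ component of the inward pressure force over the surface and compares it with the weight of the displaced fluid.

```python
import math

def buoyancy_on_sphere(radius, z_center, rho, g, p0=0.0, n=4000):
    """Upward (z) component of the pressure force on a submerged sphere.

    Assumes z points up and P(z) = p0 - rho*g*z, which follows from
    dP/dz = -rho*g for constant rho and g. The force on a surface element
    is -P n dS, with n the outward normal.
    """
    total = 0.0
    dtheta = math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta                   # polar angle from +z
        z = z_center + radius * math.cos(theta)
        pressure = p0 - rho * g * z
        ring_area = 2.0 * math.pi * radius**2 * math.sin(theta) * dtheta
        total += -pressure * math.cos(theta) * ring_area
    return total

def displaced_fluid_weight(radius, rho, g):
    """Weight of the fluid that would fill the sphere."""
    return rho * g * (4.0 / 3.0) * math.pi * radius**3

if __name__ == "__main__":
    # Illustrative values: a 10 cm sphere 2 m below the surface of water.
    print(buoyancy_on_sphere(0.1, -2.0, 1000.0, 9.81, p0=101325.0))
    print(displaced_fluid_weight(0.1, 1000.0, 9.81))
```

The constant term `p0` and the depth `z_center` should drop out of the result, since only the pressure gradient contributes to the net force.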


H.3 Stokes Theorem 
H.3.1 The curl 


Maxwell’s laws relate the circulation of the field around a closed loop to the rate of change of flux through 
the surface bounded by the closed loop. It is possible to write these integral equations in a differential form 
as follows. 

Consider the line integral around a closed loop C shown in figure H.3. 

If this area is subdivided into two areas enclosed by loops $C_1$ and $C_2$, then the sum of the line integrals
is the same

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \oint_{C_1}\mathbf{F}\cdot d\mathbf{l} + \oint_{C_2}\mathbf{F}\cdot d\mathbf{l} \tag{H.18}$$

because the contributions along the common boundary cancel since they are taken in opposite directions if
$C_1$ and $C_2$ both are taken in the same direction. Note that the sense of the line integral and the corresponding enclosed area
vector are related by the right-hand rule and this must be taken into account when subdividing
the area. Thus the area can be subdivided into an infinite number of pieces for which

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \sum_i^{N\to\infty}\oint_{C_i}\mathbf{F}\cdot d\mathbf{l} = \sum_i^{N\to\infty}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} \tag{H.19}$$

where $\Delta\mathbf{S}_i$ is the infinitesimal area bounded by the closed sub-loop $C_i$ and $\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}$ is the normal component
of this area pointing along the $\hat{\mathbf{n}}$ direction, which is the direction along which the line integral points.


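The component of the curl of the vector function along the direction $\hat{\mathbf{n}}$ is defined to be

$$(\mathrm{curl}\mathbf{F})\cdot\hat{\mathbf{n}} \equiv \lim_{\Delta S\to 0}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}} \tag{H.20}$$

Thus the line integral can be written as

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \sum_i^{N\to\infty}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} = \int\left[(\mathrm{curl}\mathbf{F})\cdot\hat{\mathbf{n}}\right]d\mathbf{S}\cdot\hat{\mathbf{n}} \tag{H.21}$$

The product $\hat{\mathbf{n}}\cdot\hat{\mathbf{n}} = 1$, that is, this is true independent of the
direction of the infinitesimal loop. Thus the above relation leads
to Stokes Theorem

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C}(\mathrm{curl}\mathbf{F})\cdot d\mathbf{S} \tag{H.22}$$

This relates the line integral to a surface integral over a surface
bounded by the loop.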


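Figure H.3: The circulation around a path is equal to the sum of the circulations around subareas made by subdividing the area.


H.3.2 Curl in cartesian coordinates


Consider the infinitesimal rectangle $\Delta x\Delta y$ pointing in the $\hat{\mathbf{k}}$ direction shown in figure H.4.

The line integral, taken in a right-handed way around $\hat{\mathbf{k}}$, gives

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = F_x\Delta x + \left(F_y + \frac{\partial F_y}{\partial x}\Delta x\right)\Delta y - \left(F_x + \frac{\partial F_x}{\partial y}\Delta y\right)\Delta x - F_y\Delta y = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\Delta x\Delta y \tag{H.23}$$

Thus since $\Delta x\Delta y = \Delta S_z$, the $z$ component of the curl is given by

$$(\mathrm{curl}\mathbf{F})\cdot\hat{\mathbf{k}} = \frac{\oint\mathbf{F}\cdot d\mathbf{l}}{\Delta S_z} = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \tag{H.24}$$

The same argument gives the component of the curl in the $y$ direction

$$(\mathrm{curl}\mathbf{F})\cdot\hat{\mathbf{j}} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \tag{H.25}$$

Similarly the same argument gives the component of the curl in the $x$ direction

$$(\mathrm{curl}\mathbf{F})\cdot\hat{\mathbf{i}} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) \tag{H.26}$$

Figure H.4: Circulation around an infinitesimal rectangle $\Delta x\Delta y$ in the $z$ direction.

Thus combining the three components of the curl gives

$$\mathrm{curl}\mathbf{F} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{\mathbf{k}} \tag{H.27}$$

Note that the cross-product of the del operator with the vector $\mathbf{F}$ is

$$\nabla\times\mathbf{F} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \tag{H.28}$$

which is identical to the right hand side of the relation for the curl in cartesian coordinates. That is,

$$\nabla\times\mathbf{F} \equiv \mathrm{curl}\mathbf{F} \tag{H.29}$$

Therefore Stokes Theorem can be rewritten as

$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_{\text{Area bounded by } C}(\mathrm{curl}\mathbf{F})\cdot d\mathbf{S} = \int_{\text{Area bounded by } C}(\nabla\times\mathbf{F})\cdot d\mathbf{S} \tag{H.30}$$
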
The physical meaning of the curl is that it is the circulation per unit area, or rotation, for an infinitesimal loop at any
location. The curl is also called the rotation, and in the German-language literature it is written as rot.
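The interpretation of the curl as circulation per unit area can be checked numerically. This is a minimal sketch with an assumed test field and hypothetical helper names: it evaluates the line integral counterclockwise around a small square in the $xy$ plane and divides by the enclosed area, which should approach $\partial F_y/\partial x - \partial F_x/\partial y$.

```python
def circulation(F, center, h, n=50):
    """Counterclockwise line integral of F around a square of side h.

    F(x, y) returns (F_x, F_y); each side is integrated by the midpoint rule.
    """
    x0, y0 = center
    ds = h / n
    total = 0.0
    for i in range(n):
        s = -h / 2 + (i + 0.5) * ds
        total += F(x0 + s, y0 - h / 2)[0] * ds    # bottom side, travelling in +x
        total += F(x0 + h / 2, y0 + s)[1] * ds    # right side, travelling in +y
        total -= F(x0 + s, y0 + h / 2)[0] * ds    # top side, travelling in -x
        total -= F(x0 - h / 2, y0 + s)[1] * ds    # left side, travelling in -y
    return total

def field(x, y):
    """Illustrative field (an assumption): F = (-y^2, x*y)."""
    return (-y * y, x * y)

def curl_z(x, y):
    """Analytic z component of the curl of `field`: y - (-2y) = 3y."""
    return 3.0 * y

if __name__ == "__main__":
    h = 1e-3
    print(circulation(field, (0.5, 0.7), h) / h**2, curl_z(0.5, 0.7))
```

Reversing the direction of travel around the loop flips the sign of the circulation, which is the right-hand-rule convention used in the text.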


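H.3 Example: Maxwell’s circulation equations


As an example of the use of the curl, consider Faraday’s Law

$$\oint_{\text{Closed loop } C}\mathbf{E}\cdot d\mathbf{l} = -\int_{\text{Surface bounded by } C}\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S}$$

Using Stokes Theorem gives

$$\oint_C \mathbf{E}\cdot d\mathbf{l} = \int_{\text{Surface bounded by } C}(\nabla\times\mathbf{E})\cdot d\mathbf{S}$$

These two relations are independent of the shape of the closed loop, thus we obtain Faraday’s Law in the
differential form

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}$$

A differential form of the Ampère-Maxwell law also can be obtained from

$$\oint_{\text{Closed loop } C}\mathbf{B}\cdot d\mathbf{l} = \mu_0\int_{\text{Surface bounded by } C}\left(\mathbf{j} + \varepsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{S}$$

Using Stokes Theorem

$$\oint_C \mathbf{B}\cdot d\mathbf{l} = \int_{\text{Surface bounded by } C}(\nabla\times\mathbf{B})\cdot d\mathbf{S}$$

Again this is independent of the shape of the loop and thus we obtain the
Ampère-Maxwell law in differential form

$$\nabla\times\mathbf{B} = \mu_0\mathbf{j} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t}$$

The differential forms of Maxwell’s circulation relations are easier to apply than the integral equations
because the differential form relates the curl to the time derivatives at the same specific location.


H.4 Potential formulations of curl-free and divergence-free fields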


Interesting consequences result from the Divergence theorem and Stokes Theorem for vector fields that are
either curl-free or divergence-free. In particular two theorems result from the second derivatives of a vector
field.


Theorem 1; Curl-free (irrotational) fields: 


For curl-free fields

$$\nabla\times\mathbf{F} = 0 \tag{H.31}$$

everywhere. This is automatically obeyed if the vector field is expressed as the gradient of a scalar field

$$\mathbf{F} = \nabla\phi \tag{H.32}$$

since

$$\nabla\times(\nabla\phi) = 0 \tag{H.33}$$

That is, any curl-free vector field can be expressed in terms of the gradient of a scalar field.

The scalar field $\phi$ is not unique, that is, any constant $\alpha$ can be added to $\phi$ since $\nabla\alpha = 0$, that is, the
addition of the constant $\alpha$ does not change the gradient. This independence to addition of a number to the
scalar potential is called a gauge invariance, discussed in chapter 13.2, for which

$$\mathbf{F} = \nabla\phi' = \nabla(\phi + \alpha) = \nabla\phi \tag{H.34}$$

That is, this gauge-invariant transformation does not change the observable $\mathbf{F}$. The electrostatic field $\mathbf{E}$
and the gravitational field $\mathbf{g}$ are examples of irrotational fields that can be expressed as the gradient of scalar
potentials.


Theorem 2; Divergence-free (solenoidal) fields: 


For divergence-free fields

$$\nabla\cdot\mathbf{F} = 0 \tag{H.35}$$

everywhere. This is automatically obeyed if the field $\mathbf{F}$ is expressed in terms of the curl of a vector field $\mathbf{G}$
such that

$$\mathbf{F} = \nabla\times\mathbf{G} \tag{H.36}$$

since $\nabla\cdot\nabla\times\mathbf{G} = 0$. That is, any divergence-free vector field can be written as the curl of a related vector
field.

As discussed in chapter 13.2, the vector potential $\mathbf{G}$ is not unique in that a gauge transformation can be
made by adding the gradient of any scalar field, that is, the gauge transformation $\mathbf{G}' = \mathbf{G} + \nabla\phi$ gives

$$\mathbf{F} = \nabla\times\mathbf{G}' = \nabla\times(\mathbf{G} + \nabla\phi) = \nabla\times\mathbf{G} \tag{H.37}$$

This gauge transformation to the vector potential $\mathbf{G}'$ does not change the observable vector
field $\mathbf{F}$. The magnetic field $\mathbf{B}$ is an example of a solenoidal field that can be expressed in terms of the curl
of a vector potential $\mathbf{A}$.
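The two identities behind these theorems, and the gauge freedom of the vector potential, can be checked with finite differences. The sketch below is illustrative only: the step size `H`, the operator helpers, and the test fields `phi` and `G` are assumptions chosen for the check, not quantities from the text.

```python
import math

H = 1e-3  # finite-difference step (an assumed value)

def partial(f, axis, p):
    """Central-difference partial derivative of scalar f along `axis` at p."""
    hi, lo = list(p), list(p)
    hi[axis] += H
    lo[axis] -= H
    return (f(*hi) - f(*lo)) / (2.0 * H)

def component(F, i):
    """Scalar function returning the i-th component of the vector field F."""
    return lambda x, y, z: F(x, y, z)[i]

def grad(phi):
    return lambda x, y, z: tuple(partial(phi, a, (x, y, z)) for a in range(3))

def div(F):
    return lambda x, y, z: sum(partial(component(F, a), a, (x, y, z)) for a in range(3))

def curl(F):
    def result(x, y, z):
        p = (x, y, z)
        return (partial(component(F, 2), 1, p) - partial(component(F, 1), 2, p),
                partial(component(F, 0), 2, p) - partial(component(F, 2), 0, p),
                partial(component(F, 1), 0, p) - partial(component(F, 0), 1, p))
    return result

def add(F, G):
    """Pointwise sum of two vector fields."""
    return lambda x, y, z: tuple(a + b for a, b in zip(F(x, y, z), G(x, y, z)))

# Illustrative scalar and vector fields (assumptions).
phi = lambda x, y, z: math.sin(x) * y * y + x * math.exp(z)
G = lambda x, y, z: (y * z * z, math.cos(x) * z, x * x * y)

if __name__ == "__main__":
    p = (0.3, -0.7, 0.5)
    print(curl(grad(phi))(*p))                    # each component close to zero
    print(div(curl(G))(*p))                       # close to zero
    print(curl(add(G, grad(phi)))(*p), curl(G)(*p))   # the two curls agree
```

The last line is the gauge statement: adding the gradient of any scalar field to the vector potential leaves its curl, the observable field, unchanged to within rounding error.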


H.4 Example: Electromagnetic fields: 


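Electromagnetic interactions are encountered frequently in classical mechanics so it is useful to discuss
the use of potential formulations of electrodynamics.

For electrostatics, Maxwell’s equations give that

$$\nabla\times\mathbf{E} = 0$$

Therefore theorem 1 states that it is possible to express this static electric field as the gradient of the scalar
electric potential $V$, where

$$\mathbf{E} = -\nabla V$$

For electrodynamics, Maxwell’s equations give that

$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$$

Assume that the magnetic field can be expressed in terms of the vector potential $\mathbf{B} = \nabla\times\mathbf{A}$, then
the above equation becomes

$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0$$

Theorem 1 gives that this curl-less field can be expressed as the gradient of a scalar field, here taken to
be the electric potential $V$.

$$\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = -\nabla V$$

that is

$$\mathbf{E} = -\left(\nabla V + \frac{\partial\mathbf{A}}{\partial t}\right)$$

Gauss’s law states that

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}$$

which can be rewritten as

$$\nabla\cdot\mathbf{E} = -\nabla^2 V - \frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} = \frac{\rho}{\varepsilon_0} \tag{X}$$

Similarly insertion of the vector potential $\mathbf{A}$ in Ampère’s Law gives

$$\nabla\times\mathbf{B} = \nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{j} + \mu_0\varepsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{j} - \mu_0\varepsilon_0\nabla\left(\frac{\partial V}{\partial t}\right) - \mu_0\varepsilon_0\left(\frac{\partial^2\mathbf{A}}{\partial t^2}\right)$$

Using the vector identity $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ allows the above equation to be rewritten as

$$\left(\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla\cdot\mathbf{A} + \mu_0\varepsilon_0\frac{\partial V}{\partial t}\right) = -\mu_0\mathbf{j} \tag{Y}$$

The use of the scalar potential $V$ and vector potential $\mathbf{A}$ leads to two coupled equations X and Y. These
coupled equations can be transformed into two uncoupled equations by exploiting the freedom to make a gauge
transformation for the vector potential such that the second bracket in equation Y is zero, which also replaces
the $\nabla\cdot\mathbf{A}$ term in equation X by a time derivative of $V$. That is, choosing the Lorentz gauge

$$\nabla\cdot\mathbf{A} = -\mu_0\varepsilon_0\frac{\partial V}{\partial t}$$

simplifies equations X and Y to be

$$\nabla^2 V - \mu_0\varepsilon_0\frac{\partial^2 V}{\partial t^2} = -\frac{\rho}{\varepsilon_0}$$

$$\nabla^2\mathbf{A} - \mu_0\varepsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\mathbf{j}$$

The virtue of using the Lorentz gauge, rather than the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, is that it separates the
equations for the scalar and vector potentials. Moreover, these two equations are the wave equations for these
two potential fields corresponding to a velocity $c = \frac{1}{\sqrt{\mu_0\varepsilon_0}}$. This example illustrates the power of using the
concept of potentials in describing vector fields.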
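As a quick numerical illustration of the wave speed that appears in these equations, the sketch below evaluates $1/\sqrt{\mu_0\varepsilon_0}$. The constant values are approximate SI values supplied here as assumptions, and `wave_speed` is a hypothetical helper name.

```python
import math

MU_0 = 4.0e-7 * math.pi        # vacuum permeability in H/m (approximate value)
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity in F/m (approximate value)

def wave_speed(mu, epsilon):
    """Propagation speed of the potentials: 1 / sqrt(mu * epsilon)."""
    return 1.0 / math.sqrt(mu * epsilon)

if __name__ == "__main__":
    print(wave_speed(MU_0, EPSILON_0))   # should be close to 2.998e8 m/s
```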


Appendix I 


Waveform analysis 


I.1 Harmonic waveform decomposition


Any linear system that is subject to a time-dependent forcing function $F(t)$ can be expressed as a linear
superposition of frequency-dependent solutions of the individual harmonic decomposition $a(\omega)$ of the forcing
function. Similarly, any linear system subject to a spatially-dependent forcing function $F(x)$ can be expressed
as a linear superposition of the wavenumber-dependent solutions of the individual harmonic decomposition
$a(k_x)$ of the forcing function. Fourier analysis provides the mathematical procedure for the transformation
between the periodic waveforms and the harmonic content, that is, $F(t) \Leftrightarrow a(\omega)$, or $F(x) \Leftrightarrow a(k_x)$. Fourier’s
theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms.
For example for a time-dependent periodic forcing function the decomposition can be a cosine series of the
form

$$F(t) = \sum_n a_n\cos(n\omega_0 t + \phi_n) \tag{I.1}$$


where $\omega_0$ is the lowest (fundamental) frequency solution. For an aperiodic function a cosine decomposition
can be of the form

$$F(t) = \int a(\omega)\cos(\omega t + \phi(\omega))\,d\omega \tag{I.2}$$


Either of the complementary functions $F(t) \Leftrightarrow a(\omega)$, or $F(x) \Leftrightarrow a(k_x)$, are equivalent representations of
the harmonic content that can be used to describe signals and waves. The following two sections give an
introduction to Fourier analysis.


I.1.1 Periodic systems and the Fourier series 


Discrete solutions occur for systems when periodic boundary conditions exist. The response of periodic
systems can be described in either the time versus angular frequency domains, or equivalently, the spatial
coordinate $x$ versus the corresponding wave number $k_x$. For periodic systems this decomposition leads to
the Fourier series where a generalized phase coordinate $\phi$ can be used to represent either the time or spatial
coordinates, that is, with $\phi = \omega_0 t$ or $\phi = k_x x$ respectively. The Fourier series relates the two representations
of the discrete wave solutions for such periodic systems.

Fourier’s theorem states that for a general periodic system any arbitrary forcing function $F(\phi)$ can be
decomposed into a sum of sinusoidal or cosinusoidal terms. The summation can be represented by three
equivalent series expansions given below, where $\phi = \omega_0 t$ or $\phi = \mathbf{k}_0\cdot\mathbf{r}$, and where $\omega_0$, $k_0$ are the fundamental
angular frequency and fundamental wave number respectively.

$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos(n\phi) + b_n\sin(n\phi)\right] \tag{I.3}$$

$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty}c_n\cos(n\phi + \varphi_n) \tag{I.4}$$

$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty}d_n\sin(n\phi + \theta_n) \tag{I.5}$$

where $n$ is an integer, and $\varphi_n$, $\theta_n$ are phase shifts fit to the initial conditions.
The normal modes of a discrete system form a complete set of solutions that satisfy the following orthogonality relation

$$\int_0^{2\pi}\psi_j^{*}(\phi)\,\psi_n(\phi)\,d\phi = \delta_{jn} \tag{I.6}$$

where $\psi_n(\phi)$ is the normalized $n$th mode and $\delta_{jn}$ is the Kronecker delta symbol defined in equation (4.10). Orthogonality can be used to determine
the coefficients for equation (I.3) to be


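$$a_0 = \frac{1}{\pi}\int_{-\pi}^{+\pi}F(\phi)\,d\phi \tag{I.7}$$

$$a_n = \frac{1}{\pi}\int_{-\pi}^{+\pi}F(\phi)\cos(n\phi)\,d\phi \tag{I.8}$$

$$b_n = \frac{1}{\pi}\int_{-\pi}^{+\pi}F(\phi)\sin(n\phi)\,d\phi \tag{I.9}$$

Similarly the coefficients for (I.4) and (I.5) are related to the above coefficients by

$$c_n^2 = d_n^2 = a_n^2 + b_n^2$$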
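The coefficient integrals can be evaluated numerically for a concrete waveform. The following sketch is an illustration only: the square-wave test function and the helper names are assumptions, and the integrals are approximated by the midpoint rule. For a unit square wave that is odd in $\phi$, the cosine coefficients should vanish and $b_n$ should equal $4/(n\pi)$ for odd $n$ and zero for even $n$.

```python
import math

def fourier_coefficients(F, n, samples=20000):
    """Midpoint-rule estimates of a_n and b_n over one period, -pi to +pi."""
    d = 2.0 * math.pi / samples
    a = b = 0.0
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * d
        a += F(phi) * math.cos(n * phi) * d
        b += F(phi) * math.sin(n * phi) * d
    return a / math.pi, b / math.pi

def square_wave(phi):
    """Illustrative odd square wave: +1 for phi > 0, -1 for phi < 0."""
    return 1.0 if phi > 0 else -1.0

def amplitude(a, b):
    """Amplitude c_n = d_n of the single-phase forms: sqrt(a_n^2 + b_n^2)."""
    return math.hypot(a, b)

if __name__ == "__main__":
    for n in range(1, 6):
        a_n, b_n = fourier_coefficients(square_wave, n)
        print(n, a_n, b_n, amplitude(a_n, b_n))
```

The slow $1/n$ fall-off of the odd harmonics is characteristic of a waveform with discontinuities.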


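Instead of the simple trigonometric form used in equations (I.3 – I.5) the cosine and sine functions can
be expanded into the exponential form where

$$\cos\phi = \frac{1}{2}\left(e^{i\phi} + e^{-i\phi}\right) \qquad \sin\phi = \frac{1}{2i}\left(e^{i\phi} - e^{-i\phi}\right) \tag{I.10}$$

then equation (I.3) becomes

$$F(\phi) = \sum_{n=-\infty}^{\infty}g_n e^{in\phi} \tag{I.11}$$

where $n$ is any integer and, from the orthogonality, the Fourier coefficients are given by

$$g_n = \frac{1}{2\pi}\int_{-\pi}^{+\pi}F(\phi)\,e^{-in\phi}\,d\phi \tag{I.12}$$

These coefficients are related to the cosine plus sine series amplitudes by

$$g_n = \tfrac{1}{2}\left(a_n - ib_n\right) \quad \text{(when } n \text{ is positive)}$$

$$g_n = \tfrac{1}{2}\left(a_{|n|} + ib_{|n|}\right) \quad \text{(when } n \text{ is negative)}$$

$$g_0 = \tfrac{1}{2}a_0$$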


These results show that the coefficients of the exponential series are in general complex, and that they
occur in conjugate pairs (that is, the imaginary part of a coefficient $g_n$ is equal but opposite in sign to that
for the coefficient $g_{-n}$). Although the introduction of complex coefficients may appear unusual, it should
be remembered that the real part of a pair of coefficients denotes the magnitude of the cosine wave of the
relevant frequency, and that the imaginary part denotes the magnitude of the sine wave. If a particular
pair of coefficients $g_n$ and $g_{-n}$ are real, then the component at the frequency $n\omega_0$ is simply a cosine; if $g_n$
and $g_{-n}$ are purely imaginary, the component is just a sine; and if, as is the general case, $g_n$ and $g_{-n}$ are
complex, both cosine and sine terms are present.

The use of the exponential form of the Fourier series gives rise to the notion of ‘negative frequency’. Of
course, $f(t) = a_n\cos\omega_n t$ is a wave of a single frequency $\omega_n = n\omega_0$ radians/second, and may be represented
by a single line of height $a_n$ in a normal spectral diagram. However, using the exponential form of the Fourier
series results in both positive and negative $\omega$ components.

The coexistence of both negative and positive angular frequencies $\pm\omega$ can be understood by consideration
of the Argand diagram where the real component is plotted along the $x$-axis and the imaginary component
along the $y$-axis. The function $g_n e^{+i\omega t}$ represents a vector of length $g_n$ that rotates with an angular velocity $\omega$
in a positive direction, that is counterclockwise, whereas $g_n e^{-i\omega t}$ represents the vector rotating in a negative
direction, that is clockwise. Thus the sum of the two rotating vectors, according to equations (I.3), leads
to cancellation of the opposite components on the imaginary $y$ axis and addition of the two $g_n\cos\omega t$ real
components on the $x$ axis. Subtraction leads to cancellation of the real $x$ components and addition of the
imaginary $y$ axis components.
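The conjugate-pair structure of the exponential coefficients can be seen in a short numerical check. This is a sketch under stated assumptions: the test waveform $1 + 2\cos\phi + 3\sin 2\phi$ and the helper names are illustrative. Its cosine term should give real coefficients at $n = \pm 1$, and its sine term should give purely imaginary coefficients of opposite sign at $n = \pm 2$.

```python
import cmath
import math

def complex_coefficient(F, n, samples=4096):
    """Midpoint-rule estimate of g_n = (1/2pi) * integral of F(phi) e^{-i n phi}."""
    d = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * d
        total += F(phi) * cmath.exp(-1j * n * phi) * d
    return total / (2.0 * math.pi)

def waveform(phi):
    """Illustrative waveform with a_0/2 = 1, a_1 = 2 and b_2 = 3."""
    return 1.0 + 2.0 * math.cos(phi) + 3.0 * math.sin(2.0 * phi)

if __name__ == "__main__":
    for n in (-2, -1, 0, 1, 2):
        print(n, complex_coefficient(waveform, n))
```

With these amplitudes the relation $g_n = \tfrac{1}{2}(a_n - ib_n)$ predicts $g_1 = 1$ and $g_2 = -1.5i$, with complex conjugates at the corresponding negative $n$.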


I.1.2 Aperiodic systems and the Fourier Transform


The Fourier transform (also called the Fourier integral) does for the non-repetitive signal waveform what
the Fourier series does for the repetitive signal. It was shown that the line spectrum of a recurrent periodic
pulse waveform is modified as the pulse duration decreases, assuming the period of the waveform (and hence
its fundamental component) remains unchanged. Suppose now that the duration of the pulses remains fixed
but the separation between them increases, giving rise to an increasing period. In the limit, only a single
rectangular pulse remains, its neighbors having moved away on either side towards $\pm\infty$. In this case, the
fundamental frequency $\omega_0$ tends towards zero and the harmonics become extremely closely spaced and of
vanishingly small amplitudes, that is, the system approximates a continuous spectrum.

Mathematically, this situation may be expressed by modifications to the exponential form of the Fourier
series already derived. Let the phase factor $\phi = \omega_0 t$ in equation (I.12), then

$$g_n = \frac{\omega_0}{2\pi}\int_{-\pi/\omega_0}^{+\pi/\omega_0}f(t)\,e^{-in\omega_0 t}\,dt = \frac{1}{\tau}\int_{-\tau/2}^{+\tau/2}f(t)\,e^{-in\omega_0 t}\,dt \tag{I.13}$$

where $\tau$ is the period of the periodic force. Let $G(\omega) = \tau g_n$, $\omega = n\omega_0$, and take the limit for $\tau\to\infty$, then
equation (I.12) can be written as

$$G(\omega) = \int_{-\infty}^{+\infty}f(t)\,e^{-i\omega t}\,dt \tag{I.14}$$

Similarly making the same limit for $\tau\to\infty$, then $\omega_0 = \frac{2\pi}{\tau}\to d\omega$ and equation (I.11) becomes

$$f(t) = \sum_{n=-\infty}^{\infty}g_n e^{in\omega_0 t} = \sum_{n=-\infty}^{\infty}\frac{\omega_0}{2\pi}G(\omega)\,e^{i\omega t} = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G(\omega)\,e^{i\omega t}\,d\omega \tag{I.15}$$
Equation (I.15) shows how a non-repetitive time-domain waveform is related to its continuous spectrum.
These are known as Fourier integrals or Fourier transforms. They are of central importance for signal
processing. For convenience the transforms often are written in the operator formalism using the $\mathcal{F}$ symbol
in the form

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}G(\omega)\,e^{i\omega t}\,d\omega \equiv \mathcal{F}^{-1}G(\omega) \tag{I.16}$$

$$G(\omega) = \int_{-\infty}^{+\infty}f(t)\,e^{-i\omega t}\,dt \equiv \mathcal{F}f(t) \tag{I.17}$$

It is very important to grasp the significance of these two equations. Equation (I.17) tells us that the Fourier
transform of the waveform $f(t)$ is continuously distributed in the frequency range between $\omega = \pm\infty$, whereas
equation (I.16) shows how, in effect, the waveform may be synthesized from an infinite set of exponential functions
of the form $e^{\pm i\omega t}$, each weighted by the relevant value of $G(\omega)$. It is crucial to realize that this transformation
can go either way equally, that is, from $G(\omega)$ to $f(t)$ or vice versa.¹

¹ The only asymmetry in the Fourier transform relations comes from the $2\pi$ factor originating from the fact that by convention
physicists use the angular frequency $\omega = 2\pi\nu$ rather than the frequency $\nu$. In order to restore symmetry many papers use the
factor $\frac{1}{\sqrt{2\pi}}$ in both relations rather than using the $\frac{1}{2\pi}$ factor in equation I.16 and unity in equation I.17.


I.1 Example: Fourier transform of a single isolated square pulse:


Consider a single isolated square pulse of width $\tau$ that is described by the rectangular function $\Pi$ defined
as

$$\Pi(t) = \begin{cases} 1 & |t| < \frac{\tau}{2} \\ 0 & |t| > \frac{\tau}{2} \end{cases}$$

That is, assume that the amplitude of the pulse is unity between $-\frac{\tau}{2} < t < \frac{\tau}{2}$. Then the Fourier transform is

$$G(\omega) = \int_{-\tau/2}^{+\tau/2}e^{-i\omega t}\,dt = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$$

which is an unnormalized sinc function of $\frac{\omega\tau}{2}$. Note that the width of the pulse $\Delta t = \pm\frac{\tau}{2}$ leads to a frequency
envelope that has the first zeros at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Thus the product of these widths is $\Delta t\cdot\Delta\omega = \pm\pi$, which is
independent of the width of the pulse, that is, $\Delta\omega = \frac{\pi}{\Delta t}$, which is an example of the uncertainty principle
which is applicable to all forms of wave motion.
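The square-pulse transform can be verified by direct numerical integration. The sketch below is illustrative: the function names and the chosen pulse width are assumptions. It compares the midpoint-rule integral with the closed form $\tau\sin(\omega\tau/2)/(\omega\tau/2)$ and confirms that the first zero lies at $\omega = 2\pi/\tau$, so that the width product is $\pi$ whatever the pulse width.

```python
import cmath
import math

def pulse_transform(omega, tau, samples=20000):
    """Midpoint-rule integral of e^{-i omega t} over the pulse, -tau/2 to +tau/2."""
    dt = tau / samples
    total = 0j
    for k in range(samples):
        t = -tau / 2 + (k + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * dt
    return total

def pulse_transform_exact(omega, tau):
    """Closed form tau * sin(omega*tau/2) / (omega*tau/2)."""
    x = omega * tau / 2.0
    return tau if x == 0 else tau * math.sin(x) / x

def width_product(tau):
    """Half-width of the pulse times the position of the first spectral zero."""
    return (tau / 2.0) * (2.0 * math.pi / tau)

if __name__ == "__main__":
    tau = 2.0
    print(pulse_transform(1.3, tau), pulse_transform_exact(1.3, tau))
    print(abs(pulse_transform(2.0 * math.pi / tau, tau)))   # first zero
    print(width_product(0.5), width_product(8.0))           # both equal pi
```

Narrowing the pulse pushes the first zero to higher frequency in exact proportion, which is the reciprocal relation between time and frequency widths noted above.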


I.2 Example: Fourier transform of the Dirac delta function:


The Dirac delta function, $\delta(t - t')$, is a pulse of extremely short duration and unit area at $t = t'$ and is
zero at all other times. That is,

$$\int_{-\infty}^{+\infty}\delta(t - t')\,dt = 1$$

The Dirac function, which is sometimes referred to as the impulse function, has many important applications
to physics and signal processing. For example, a shell shot from a gun is given a mechanical impulse
imparting a certain momentum to the shell in a very short time. Other things being equal, one is interested
only in the impulse imparted to the shell, that is, the time integral of the force accelerating the shell in the
gun, rather than the details of the time dependence of the force. Since the force acts for a very short time
the Dirac delta function can be employed in such problems.

As described in section 3.11 and appendix I, the Dirac delta function is employed in signal processing
when signals are sampled for short time intervals. The Fourier transform of the delta function is needed for
discussion of sampling of signals

$$G(\omega) = \int_{-\infty}^{+\infty}\delta(t - t')\,e^{-i\omega t}\,dt = e^{-i\omega t'}$$

Since $e^{-i\omega t}$ essentially is constant over the infinitesimal time duration of the $\delta(t - t')$ function, and the
time integral of the $\delta$ function is unity, the term $e^{-i\omega t'}$ has unit magnitude for any value of $\omega$ and has
a phase shift of $-\omega t'$ radians. For $t' = 0$ the phase shift is zero and thus the Fourier transform of a
Dirac $\delta(t)$ function is $G(\omega) = 1$. That is, this is a uniform white spectrum for all values of $\omega$.
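The delta-function result can be approached numerically by transforming a very narrow pulse of unit area. In this sketch the pulse width, the sample count and the helper names are assumptions made for illustration. The transform should have magnitude close to 1 and a phase equal to $-\omega t'$.

```python
import cmath

def narrow_pulse_transform(omega, t_prime, width=1e-4, samples=200):
    """Transform of a unit-area rectangular pulse of the given width centred at t_prime."""
    dt = width / samples
    height = 1.0 / width                      # unit area: height * width = 1
    total = 0j
    for k in range(samples):
        t = t_prime - width / 2 + (k + 0.5) * dt
        total += height * cmath.exp(-1j * omega * t) * dt
    return total

def magnitude_and_phase(value):
    """Return (|G|, phase in radians) of a complex number."""
    return abs(value), cmath.phase(value)

if __name__ == "__main__":
    print(magnitude_and_phase(narrow_pulse_transform(5.0, 0.3)))   # phase near -1.5
    print(magnitude_and_phase(narrow_pulse_transform(5.0, 0.0)))   # phase near 0
```

Repeating the calculation for other values of `omega` should leave the magnitude unchanged, which is the flat, white spectrum described above.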


I.2 Time-sampled waveform analysis


An alternative approach for analyzing periodic signals, that is complementary to the Fourier analysis harmonic
decomposition, is time-sampled (discrete-sample) waveform analysis where the signal amplitude is
measured repetitively at regular time intervals in a time-ordered sequence, that is, a sequence of samples of
the instantaneous delta-function amplitudes is recorded. Typically an analog-to-digital converter is used
to digitize the amplitude for each measured sample and the digital numbers are recorded; this process is
called digital signal processing.

The general principles are best explained by first considering the response of a linear system to a step
function impulse, followed by a square impulse, and leading to the response of a $\delta$-function impulsive driving
force.

Figure I.1: Response of an underdamped linear oscillator with $\omega = 10$, and $\Gamma = 2$ to the following impulsive
force. (a) Step function force $F = 0$ for $t < 0$ and $F = m$ for $t > 0$. (b) Square-wave force where $F = m$ for
$0 < t < \tau$ for $\tau = 3$, and $F = 0$ at other times. (c) Delta-function impulse $P = 1$.


I.2.1 Delta-function impulse response


Consider the damped oscillator equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{I.18}$$

and assume that a step function is applied at time $t = 0$. That is,

$$\frac{F(t)}{m} = 0 \quad t < 0 \qquad\qquad \frac{F(t)}{m} = a \quad t > 0 \tag{I.19}$$

where $a$ is a constant. The initial conditions are that $x(0) = \dot{x}(0) = 0$.

The transient or complementary solution is the solution of the linearly-damped harmonic oscillator

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{I.20}$$

This is independent of the driving force and the solution is given in the chapter 3.5 discussion of the linearly-damped
harmonic oscillator.

The particular, steady-state, solution is easy to obtain just by inspection since the force is a constant,
that is, the particular solution is

$$x_S = \frac{a}{\omega_0^2} \quad t > 0 \qquad\qquad x_S = 0 \quad t < 0$$

Taking the sum of the transient and particular solutions, using the initial conditions, gives the final solution
to be

$$x(t) = \frac{a}{\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\cos\omega_1 t - \frac{\Gamma e^{-\frac{\Gamma}{2}t}}{2\omega_1}\sin\omega_1 t\right] \tag{I.21}$$

where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. This functional form is shown in figure I.1a. Note that the amplitude of the
transient response equals $-\frac{a}{\omega_0^2}$ at $t = 0$ to cancel the particular solution when it jumps to $+\frac{a}{\omega_0^2}$. The oscillatory
behavior then is just that of the transient response.
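The closed-form step response can be cross-checked against a direct numerical integration of the equation of motion. The sketch below is illustrative: it uses a fourth-order Runge-Kutta integrator chosen here for convenience, parameter values matching the figure caption ($\omega_0 = 10$, $\Gamma = 2$) with an assumed $a = 1$, and hypothetical function names.

```python
import math

def step_response(t, a, omega0, gamma):
    """Closed-form response to a step F/m = a applied at t = 0, starting at rest."""
    if t < 0:
        return 0.0
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    decay = math.exp(-gamma * t / 2.0)
    return (a / omega0**2) * (1.0 - decay * math.cos(omega1 * t)
                              - (gamma / (2.0 * omega1)) * decay * math.sin(omega1 * t))

def integrate_step(t_end, a, omega0, gamma, dt=1e-3):
    """RK4 integration of x'' + gamma x' + omega0^2 x = a with x(0) = x'(0) = 0."""
    def deriv(x, v):
        return v, a - gamma * v - omega0**2 * x

    x = v = 0.0
    for _ in range(round(t_end / dt)):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return x

if __name__ == "__main__":
    print(step_response(2.0, 1.0, 10.0, 2.0), integrate_step(2.0, 1.0, 10.0, 2.0))
    print(step_response(50.0, 1.0, 10.0, 2.0))   # settles at a / omega0^2
```

At long times the transient dies away and the displacement settles at the particular solution $a/\omega_0^2$.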

A square impulse can be generated by the superposition of two opposite-sign step functions separated by
a time $\tau$ as shown in figure I.1b.

The square impulse can be taken to the limit where the width $\tau$ is negligibly small relative to the response
times of the system. It can be shown that letting $\tau\to 0$, but keeping the magnitude of the total impulse
$P = ma\tau$ finite for the impulse at time $t_0$, leads to the solution for the $\delta$-function impulse occurring at $t_0$

$$x(t) = \frac{P}{m\omega_1}e^{-\frac{\Gamma}{2}(t - t_0)}\sin\omega_1(t - t_0) \quad t > t_0 \tag{I.22}$$

This response to a delta function impulse is shown in figure I.1c for the case where $t_0 = 0$. An example is
the response when the hammer strikes a piano string at $t = 0$.

Figure I.2: Decomposition of the function x(t) = 2 sin (t) +sin (5t) +3 sin (15t) +4 sin(25t) into a time-ordered
sequence of $\delta$-function samples.


I.2.2 Green’s function waveform decomposition


The response of the linearly-damped linear oscillator to a delta-function impulse, which has been expressed
above, can be used to exploit the powerful Green’s technique for decomposition of any general forcing
function. That is, if the driven system is linear, then the principle of superposition is applicable, allowing
expression of the inhomogeneous part of the differential equation as the sum of individual delta functions.
That is,

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \sum_{n=-\infty}^{\infty}\frac{F_n(t)}{m} = \sum_{n=-\infty}^{\infty}I_n(t) \tag{I.23}$$

As illustrated in figure I.2, discrete-time waveform analysis involves repeatedly sampling the instantaneous
amplitude in a regular and repetitive sequence of $\delta$-function impulses. Since the superposition principle
applies for this linear system, the waveform can be described by a sum of an ordered series of delta-function
impulses where $t'$ is the time of an impulse. Integrating over all the $\delta$-function responses that have
occurred at time $t'$, that is, prior to the time of interest $t$, leads to

$$x(t) = \int_{-\infty}^{t}\frac{F(t')}{m\omega_1}e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t')\,dt' \quad t > t' \tag{I.24}$$

The Green’s function $G(t - t')$ is defined by

$$G(t - t') \equiv \frac{1}{m\omega_1}e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t') \quad t \geq t' \tag{I.25}$$

$$G(t - t') = 0 \quad t < t'$$

Superposition allows the summed response of the system to be written in an integral form

$$x(t) = \int_{-\infty}^{t}F(t')\,G(t - t')\,dt' \tag{I.26}$$

which gives the final time dependence of the forced system. This repetitive time-sampling approach avoids
the need of using Fourier analysis. Note that the Green’s function $G(t - t')$ includes implicitly the frequency
of the free undamped linear oscillator $\omega_0$, the free damped linear oscillator $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$, as well as the
damping coefficient $\Gamma$. Access to the combination of fast microcomputers coupled to fast digital sampling
techniques has made digital signal sampling the pre-eminent technique for signal recording of audio, video,
and detector signal processing.
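The Green’s function integral can be tested against a case with a known answer. The following sketch is illustrative: the function names, the midpoint-rule quadrature and the parameter values are assumptions. It convolves a constant force switched on at $t' = 0$ with the Green’s function and compares the result with the closed-form step response of the damped oscillator.

```python
import math

def green(elapsed, m, omega0, gamma):
    """Green's function of the damped oscillator; zero before the impulse."""
    if elapsed < 0:
        return 0.0
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    return math.exp(-gamma * elapsed / 2.0) * math.sin(omega1 * elapsed) / (m * omega1)

def response(force, t, m, omega0, gamma, t_start=0.0, samples=20000):
    """Midpoint-rule integral of F(t') G(t - t') dt' from t_start to t.

    The force is assumed to vanish before t_start.
    """
    dt = (t - t_start) / samples
    total = 0.0
    for k in range(samples):
        t_prime = t_start + (k + 0.5) * dt
        total += force(t_prime) * green(t - t_prime, m, omega0, gamma) * dt
    return total

def step_closed_form(t, a, omega0, gamma):
    """Closed-form response to a step F/m = a applied at t = 0."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    decay = math.exp(-gamma * t / 2.0)
    return (a / omega0**2) * (1.0 - decay * math.cos(omega1 * t)
                              - (gamma / (2.0 * omega1)) * decay * math.sin(omega1 * t))

if __name__ == "__main__":
    m, a = 2.0, 0.5                       # illustrative mass and acceleration
    constant_force = lambda t_prime: m * a
    print(response(constant_force, 1.5, m, 10.0, 2.0),
          step_closed_form(1.5, a, 10.0, 2.0))
```

Any other force history can be passed as `force` in the same way, which is the practical content of the superposition argument: the system is characterized once by its impulse response.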
about center of mass of uniform cube, 320
about corner of uniform solid cube, 321 
characteristic (secular) equation, 318 
components, 316 
diagonalization, 318 
general properties, 323 
hula hoop, 323 
moments of inertia, 316 
parallel-axis theorem, 319 
perpendicular-axis theorem, 322 
plane laminae, 322 
principal axes, 317 
principal moments of inertia, 317 
products of inertia, 316 
thin book, 323 
Inertial frame, 10 
Galilean invariance, 465 
Inner product 
tensor algebra, 537 
tensors, 317 
Inverse variational calculus, 234 
Irrotational flow, 458 


Jacobi 
energy integral, 186 
history, 6 
Jacobi’s complete integral 
Hamilton-Jacobi theory, 422 
Jacobian 
example, 542 
general properties, 541 
transformation of differentials, 541 
transformation of integrals, 541 
Jacobian determinant, 541 


Kepler 

history, 2 

laws of planetary motion, 259, 284
Kinetic energy 

generalized coordinates, 185 

scleronomic systems, 185, 186 
Kirchhoff’s rules, 67, 244
Kuramoto model
coupled oscillators, 395 


Lagrange 
calculus of variations, 111 
history, 5 
Lagrange equations 
d’Alembert’s principle, 139 
Hamilton’s action principle, 225 
Hamilton’s principle, 141 
Lagrange multipliers, 142 
Lagrange equations 
generalized coordinates, 142 
Lagrange multipliers 
algebraic equations of constraint, 126 
Euler equations, 125 
integral equations of constraint, 128 
Lagrangian 
definition, 111 
equivalent lagrangians, 232 
extended formalism, 479 
non standard, 234, 238 
relativistic free particle, 482 
rotating frame, 293 
special relativity, 482 
standard, 232 
state space, 202 
time dependent, 169 
Lagrangian density, 448 
Lagrangian mechanics 
Atwoods machine, 151 


block sliding on moveable inclined plane, 153 
body on periphery of rolling wheel, 166 


central forces, 147 


comparison with Hamiltonian mechanics, 441 
comparison with Newtonian mechanics, 172 


cyclic coordinates, 184 

disk rolling on inclined plane, 148 
generalized coordinates, 125, 172 
holonomic constraints, 122, 144 
mass sliding on paraboloid, 158 
mass sliding on rotating rod, 154 
motion in gravitational field, 146 
motion of a free particle, 146 
non-conservative forces, 247 
partial holonomic systems, 161 
plane pendulum, 191 


solid sphere sliding on hemispherical surface, 165 
sphere rolling down inclined plane on frictionless floor, 154
spherical pendulum, 155
spring pendulum, 156
two masses sliding on inclined planes, 151


unconstrained motion, 146 
velocity-dependent Lorentz force, 168 
yo-yo, 157 
Lame’s modulus of elasticity, 453 
Legendre transform 


Hamiltonian and Lagrangian mechanics, 200, 542 


Leibniz 
history, 3, 5 
vis viva, xviii 

Linear oscillator 
critically damped, 60 
driven, 62 
energy dissipation, 61 
linear damping, 58 
Lissajous figures, 55 
overdamped, 60 
Q factor, 61 
resonance, 65 


Steady state response of driven oscillator, 63 


superposition, 54 


transient response of driven oscillator, 62 


underdamped, 59 
Linear systems 
Fourier harmonic analysis, 70 
Linear velocity-dependent dissipation, 241 
Linearly-damped linear oscillator 
characteristic frequency, 58 
damping parameter, 58 
Liouville’s theorem 
phase space, 415 
Lissajous figure, 55 
Lorentz 
relativistic transformation, 467 
Lorentz force in electromagnetism 
Poisson brackets, 412 
Lorentz transformation 
Minkowski metric, 476 
Lyapunov exponent 
onset of chaos, 103 


Mach’s principle 
general theory of relativity, 488 
Many-body systems 
angular momentum, 16 
energy conservation, 18 
linear momentum, 14 
Mass 
gravitational, 39 
inertial, 38 
Matrix algebra, 505 
addition, 506 


swinging mass connected to a rotating mass, 159 
two connected blocks sliding without friction, 152 
two connected masses sliding on rigid rail, 160 


adjoint matrix, 507 
degenerate eigenvalues, 513 
diagonalization, 511 
example of eigenvectors, 512
Hermitian matrix, 507
history, 505
identity matrix, 507
inverse matrix, 507
matrix multiplication, 506
orthogonal matrix, 507
scalar multiplication, 506
secular determinant, 511
transpose matrix, 507
unitary matrix, 508
Maupertuis
action principle, 228
history, 5
Max Born, 8
history, 505
quantum mechanics, 499
Maxwell stress tensor, 455
Maxwell’s equations
Gauss’s law and flux, 549
Michelson and Morley experiment
ether velocity, 466
Minkowski metric, 476
Minkowski space time
special relativity, 477
Modulus of elasticity
bulk modulus, 453
Lamé’s modulus, 453
Poisson’s ratio, 454
shear modulus, 454
Young’s modulus, 453
Moment of inertia
thin door, 33
Momentum
angular momentum, 11
linear momentum, 9, 15
Multivariate calculus
linear operators, 540
partial differentiation, 539


Navier-Stokes equation
fluid flow, 460
Newton
equations of motion, 24
history, 3
laws of gravitation, 38
laws of motion, 9
Principia, xviii, 3
Newton’s laws of gravitation, 44
Newtonian mechanics
conservative forces, 25
constant force problems, 24
constrained motion, 27
diatomic molecule, 26
linear restoring force, 25
perturbation methods, 37
position-dependent forces, 26
projectile motion, 29
rocket problem, 30
roller coaster, 27
time-dependent forces, 34
variable mass, 29
velocity-dependent forces, 28
vertical fall in gravitational field, 28
Noether’s theorem
Atwood’s machine, 182
conservation of angular momentum, 183
conservation of linear momentum, 182
diatomic molecule, 184
history, 8
invariant transformations, 181
rotational invariance, 183
symmetries and invariance, 179, 193
symmetry in deformed nuclei, 184
translational invariance, 182
Non-conservative forces
projectile motion, 247
Rayleigh dissipation force, 241
Non-holonomic systems
non-conservative forces, 247
velocity-dependent Lorentz force, 247
Non-inertial frames
centrifugal force, 294
Coriolis force, 294, 295
effective forces acting, 294
effective gravitation, 303
Foucault pendulum, 308
free fall on earth, 305
horizontal motion on the earth, 305
Lagrangian and Hamiltonian, 293
low-pressure systems, 306
Newtonian mechanics, 292
nucleon orbits in spheroidal potential well, 301
pirouette, 298
projectile fired vertically upwards, 305
projectile motion near surface of earth, 302
Rossby number, 306
rotating frame, 290
rotation plus translation, 292
time derivatives for a rotating frame, 291
trajectories for free motion on earth, 304
translation, 289
transverse, azimuthal, force, 294
weather systems, 306
Non-linear systems
bifurcation, 92, 103
driven damped plane pendulum, 97
limit cycle, 93
onset of chaos, 101
period doubling, 100
point attractor, 92
sensitivity to initial conditions, 101 
soliton, 107 
turbulence in fluid flow, 460 
van der Pol oscillator, 94 
weak non-linearity, 90 
Norbert Wiener 
quantum mechanics, 499 
Normal modes, 365 


Orbit equation 
differential orbit equation, 254 
free body motion, 254, 256 
Orbit stability 
Bertrand’s theorem, 263 
constant restoring force, 271 
Hooke’s law restoring force, 268 
inverse square law, 269 
two-body motion, 267 


Parallel-axis theorem 
inertia tensor, 319 
Pascal 
history, 3 
Pauli exclusion principle 
quantum physics, 496 
Pendulum 
Foucault, 308 
plane, 57, 206 
plane pendulum, 191 
spherical, 155, 213 
spring pendulum, 156 
Permutation symbol, 516 
Perpendicular axis theorem 
inertia tensor, 322 
Phase space 
harmonic oscillator, 56 
Liouville’s theorem, 415 
Phase velocity 
wave packets, 105 
wavepackets, 73, 74 
Philosophical developments, xix 
Photoelectric effect 
Einstein, 494 
Millikan, 494 
Planck 
constant, 493 
history, 493 
Plane pendulum 
state space, 57 
Plato 
history, 1 
Poincaré
chaos, 89 
history, 7 
three-body problem, 89 
Poincaré sections
state-space plots, 104 
Poincaré-Bendixson theorem
non-linear systems, 93 
Poisson 
history, 6 
Poisson brackets 
angular momentum conservation, 409 
canonical transformation, 406 
commutation relation, 407, 497 
definition, 405 
fundamental, 405 
Hamilton equations of motion, 411 
invariance to canonical transformations, 406 
Lorentz force in electromagnetism, 412 
time dependence, 408 
two-dimensional oscillator, 413 
wave motion and uncertainty principle, 412 
Poisson’s ratio, 454 
Potential theory 
gravitation, 41 
Precession rate 
inertially-symmetric rigid rotor, 340 
Principle of covariance 
general theory of relativity, 488 
Principle of equivalence 
weak principle, 39 
Principle of minimal gravitational coupling, 489 


Q-factor 

damped linear oscillator, 61 
Quantum mechanics 

Heisenberg, 496 

Heisenberg’s matrix representation, 497 

Max Born, 499 

Norbert Wiener, 499 

Paul Dirac, 408, 497 

Pauli exclusion principle, 496 

Schrödinger, 496

Schrödinger wave mechanics, 499
Queen Dido’s problem, 129 


Radius of gyration, 36 
Rayleigh dissipation function, 241 
Rayleigh’s dissipation function 
Hamiltonian mechanics, 242, 248 
Ohm's law, 244 
Reduced mass 
two-body motion, 251 
Refractive index, 106 
Relativistic Doppler effect 
special theory of relativity, 471 
Relativistic four vector 
scalar product, 476 
Restricted holonomic systems 
mass sliding on hemispherical shell, 161 
sphere rolling on a hemispherical shell, 163
Reynolds number 
fluid flow, 461 
laminar flow, 462 
turbulent flow, 462 
Rheonomic constraint, 124 
Riemannian geometry, 489 
Rigid-body rotation 
about a body-fixed non-symmetry axis, 32 
about a body-fixed point, 314 
about a point, 313 
about body-fixed symmetry axis, 31 
about fixed axis, 313 
Andoyer-Deprit variables, 337
angular momentum, 325 
angular momentum about corner of a uniform cube, 326
angular momentum of cube about centre of mass, 325
angular velocities in terms of the Euler angle velocities, 333
billiards, 33 
body-fixed axis, 31 
Chasles’ theorem, 314 
Euler equations for torque-free motion, 336 
Euler’s equations of motion, 334 
Hamiltonian approach, 337 
inertia tensor, 316 
kinetic energy, 327 
kinetic energy in terms of Euler angular velocities, 332
matrix formulation, 317 
nutation, 331 
parallel-axis theorem, 319 
pivoting versus rolling, 354 
precession, 331 
rolling, 354 
rotating dumbbell, 336 
spin, 331 
stability for torque-free motion, 344 
stability of a rolling wheel, 353 
static and dynamic balancing, 355 
symmetric top about a fixed point, 347 
torque-free rotation of symmetric top, 337 
Rigid-body rotation about a point 
tippe top, 350 
Rolling wheel 
symmetric rigid-body rotation, 351
Rotation matrix, 525 
example, 527 
finite rotations, 528 
infinitesimal rotations, 529
proper and improper rotations, 529 
Rotational invariants 
scalar products, 333 
Rotational transformation
rotation matrix, 525 
Routh 
Routhian reduction, 210 
Routhian reduction, 210 
cyclic and non-cyclic Routhians, 299 
inverse-square central potential, 216 
non-cyclic Routhian, 212 
rotating frames, 299 
rotation of a symmetric top about a fixed point, 348
Routhian, 210 
spherical pendulum, cyclic Routhian, 214 
spherical pendulum, non-cyclic Routhian, 215 
Routhian reduction 
cyclic Routhian, 211 
Rutherford scattering, 274 
cross section, 276 
distance of closest approach, 276 
impact parameter, 275 


Scattering 
energy transfer, 36 
Schrödinger
history, 8
Schrödinger equation
Hamilton-Jacobi equation, 500
Schrödinger wave mechanics
quantum mechanics, 499 
Scleronomic constraint, 124, 143, 185, 186, 369 
Shear modulus of elasticity, 454 
Signal processing 
coaxial cable, 72 
discrete-function analysis, 558 
Signal velocity 
wave packets, 73, 105 
Simultaneity 
special theory of relativity, 469
Slow light 
wave packets, 106 
Snell’s law, xvii 
Soliton 
non-linear systems, 107 
Soliton wave, 107 
Sommerfeld 
history, 8 
Sommerfeld atom 
quantum of action, 495 
Spatial inversion transformation, 530 
Special theory of relativity 
Bohr-Sommerfeld atom, 487 
energy, 473 
extended Hamiltonian formalism, 481, 484 
extended Lagrangian formulation, 479
force, 473 
four-dimensional space-time, 475
Lagrangian, 482 
Lorentz spatial contraction, 469 
Minkowski space, 477 
momentum transformations, 472 
momentum-energy four vector, 478 
relativistic Doppler effect, 471 
simultaneity, 469 
time dilation, 468 
twin paradox, 471 
velocity transformations, 472 
Spherical coordinates 
Hamiltonian, 204 
Spherical harmonic oscillator 
two-body force, 263 
Spherical pendulum 
Hamiltonian mechanics, 213 
Lagrangian mechanics, 155 
Spring constant, 454 
Standard Lagrangian, 232 
State space 
Lagrangian mechanics, 202 
plane pendulum, 57 
State-space orbits 
Poincaré sections, 104
Stern Gerlach 
space quantization, 496 
Strain tensor 
elasticity, 452 
Stress tensor 
elasticity, 452 
Strong equivalence principle 
general theory of relativity, 488 
Superposition 
Fourier series, 555 
harmonic wave analysis, 70 
linear equation of motion, 54 
Symmetric top 
Feynman’s wobbling plate, 342 
nutation, 349
oblate spheroid, 339 
precession, 349 
precession rate for torque-free symmetric top, 342
prolate spheroid, 339 
rotation about a fixed point, 347 
spin, 350 
spinning jack, 349 
torque-free rotation, 337 
Symmetries 
invariance, 193 
Noether’s theorem, 181 
Symmetry tensor 
anisotropic harmonic oscillator, 414 
isotropic harmonic oscillator, 266 
Poisson brackets, 414


Teleology, 5, 228 
Tennis racket rotation 

asymmetric-rotor rotation, 345 
Tensor algebra 

contravariant tensor, 536 

covariant tensor, 536 

inner product, 317, 533, 537 

outer product, 534 

transformation properties, 538 
Three-body problem 

Lagrange points, 272 

planar approximation, 272 

restricted 3-body problem, 272 
Time dependent force 

nonautonomous systems, 169 
Time invariance 

conservation of energy, 186 
Time reversal transformation, 531 
Tippe top 

symmetric rigid-body rotation about a point, 350 
Tornadoes 

weather systems, 307 
Torque free rotation of asymmetric body, 346 
Total mechanical energy, 19 
Transformation properties of common observables, 538 
Translational invariance 

Noether’s theorem, 182 
Tumbling of an asymmetric rotor 

rigid-body rotation, 356 
Turbulence in fluid flow 

non-linear system, 460 
Twin paradox 

special theory of relativity, 471 
Two-body central forces 

conservative forces, 249 
Two-body kinematics, 278 

angle transformation, 280 

recoil energies, 282 

velocity transformation, 279 
Two-body motion 

angular momentum, 251 

apocenter, 259 

barycenter, 251 

bound orbits, 258 

equations of motion, 253 

equivalent one-body representation, 250 

Hamiltonian, 255 

inverse cubic central force, 270 

inverse square law, 257 

isotropic harmonic oscillator, 263 

Kepler’s laws, 259, 284 

Laplace-Runge-Lenz vector, 261 

orbit solutions, 256
orbit stability, 267
pericenter, 258 
properties of objects in solar system, 260 
reduced mass, 251 
unbound orbits, 260 
Two-body scattering 
differential cross section, 274 
impact parameter, 260 
Rutherford scattering, 274 
total cross section, 273 
Two-coupled harmonic oscillators 
centre-of-mass oscillations, 366 
eigenfrequencies, 364 
grand piano, 368 
normal modes, 363, 365 
symmetric and antisymmetric normal modes, 366 
weak coupling, 367 


Uncertainty principle 

Heisenberg, 80 

quantum baseball, 82 
Uncertainty principle for wave motion, 80 
Unity of classical and quantum mechanics, 504 


van der Pol oscillator 
attractor, 94 
strong non-linearity, 96 
weak non-linearity, 95 
Variational principles 
calculus of variations, 111 
philosophy, 111, 504 
principle of economy, 111, 504 
Vector algebra 
linear operations, 515 
Vector differential calculus 
scalar differential operator, 543 
scalar differential operators, 543 
Vector differential operators 
curl, 546 
curvilinear coordinates, 545 
divergence, 546 
gradient, 544, 545 
Laplacian, 545, 546 
scalar product, 544 
vector product, 544 
Vector integral calculus 
curl, 551 
curl in cartesian coordinates, 551 
curl-free field, 553 
divergence in cartesian coordinates, 548 
divergence theorem, 548 
divergence-free field, 553 
Gauss’s theorem, 547 
line integral, 547 
Stokes’ theorem, 550
Vector multiplication 
scalar product, 515

scalar triple product, 517 

vector product, 516 

vector triple product, 518 
Vibration isolation 

linearly-damped oscillator, 71 
Virial theorem, 22 

Hooke’s law, 22 

ideal gas law, 23 

inverse square law, 23 

mass of galaxies, 23 
Virtual work 

d’Alembert’s principle, 138 

principle, 138 


Wave equation, 68 
stationary wave solutions, 69 
travelling wave solutions, 69
Wave motion 
discrete-function analysis, 558 
dispersion on discrete lattice chain, 393 
electromagnetic waves in ionosphere, 77 
group velocity for discrete lattice chain, 393 
group velocity for water waves, 76 
group velocity of de Broglie waves, 496 
plasma oscillation frequency, 78 
uncertainty principle, 80 
water waves breaking on a beach, 76 
Wave packets 
fast light, 106 
Fourier transform, 79 
group velocity, 73, 74, 105 
phase velocity, 73, 74, 105 
signal velocity, 73, 105 
slow light, 106 
uncertainty principle, 80 
Wave-particle duality 
de Broglie, 496, 499 
Hamilton-Jacobi theory, 432 
Schrödinger, 499
Weak equivalence principle 
general theory of relativity, 488 
Weather systems 
high-pressure systems, 308 
low-pressure systems, 306 
tornadoes, 307 
Work 
definition, 12, 20 


Young’s modulus of elasticity, 453 


Zeeman effect 
weakly-coupled normal modes, 367 


Two dramatically different philosophical approaches to classical mechanics 
were proposed during the 17th to 18th centuries. Newton developed his vectorial
formulation that uses time-dependent differential equations of motion to relate
vector observables like force and rate of change of momentum. Euler, Lagrange,
Hamilton, and Jacobi developed powerful alternative variational formulations
based on the assumption that nature follows the principle of least action. These 
variational formulations now play a pivotal role in science and engineering. 


This book introduces variational principles and their application to classical 
mechanics. The relative merits of the intuitive Newtonian vectorial formulation
and the more powerful variational formulations are compared. Applications to 
a wide variety of topics illustrate the intellectual beauty, remarkable power, and 
broad scope provided by use of variational principles in physics. 


This second edition adds discussion of the use of variational principles applied 
to the following topics: 


(1) Systems subject to initial boundary conditions 

(2) The hierarchy of related formulations based on action, Lagrangian, 
Hamiltonian, and equations of motion, applied to systems that involve symmetries

(3) Variational principles applied to non-conservative systems

(4) Variable-mass systems 

(5) The General Theory of Relativity 


Douglas Cline is a Professor of Physics in the Department of Physics and 
Astronomy, University of Rochester, Rochester, New York. 